Abstract

Denote by $A$ the adjacency matrix of an Erdős-Rényi graph with bounded average degree.
We consider the problem of maximizing $\langle A - \mathbb{E}\{A\}, X\rangle$ over the set of positive semidefinite
matrices $X$ with diagonal entries $X_{ii} = 1$. We prove that for large (bounded) average degree $d$,
the value of this semidefinite program (SDP) is, with high probability, $2n\sqrt{d} + n\,o(\sqrt{d}) + o(n)$.
For a random regular graph of degree $d$, we prove that the SDP value is $2n\sqrt{d-1} + o(n)$,
matching a spectral upper bound. Informally, Erdős-Rényi graphs appear to behave similarly
to random regular graphs for semidefinite programming.

We next consider the sparse, two-groups, symmetric community detection problem (also
known as planted partition). We establish that SDP achieves the information-theoretically
optimal detection threshold for large (bounded) degree. Namely, under this model, the vertex set
is partitioned into subsets of size $n/2$, with edge probability $a/n$ (within group) and $b/n$ (across).
We prove that SDP detects the partition with high probability provided $(a-b)^2/(4d) > 1 + o_d(1)$,
with $d = (a+b)/2$. By comparison, the information-theoretic threshold for detecting the hidden
partition is $(a-b)^2/(4d) > 1$: SDP is nearly optimal for large bounded average degree.

Our proof is based on tools from different research areas: (i) A new 'higher-rank' Grothendieck
inequality for symmetric matrices; (ii) An interpolation method inspired by statistical physics;
(iii) An analysis of the eigenvectors of deformed Gaussian random matrices.
1 Introduction and main results


1.1 Background 

Let $G = (V,E)$ be a random graph with vertex set $V = [n]$, and let $A_G \in \{0,1\}^{n\times n}$ denote its
adjacency matrix. Spectral algorithms have proven extremely successful in analyzing the structure
of such graphs under various probabilistic models. Interesting tasks include finding clusters,
communities, latent representations, collaborative filtering and so on [AKS98, McS01, NJW+02, CO06].
The underlying mathematical justification for these applications can be informally summarized as
follows (more precise statements are given below):


If $G$ is dense enough, then $A_G - \mathbb{E}\{A_G\}$ is much smaller, in operator norm, than $\mathbb{E}\{A_G\}$.

(Recall that the operator norm of a symmetric matrix $M$ is $\|M\|_{\mathrm{op}} = \max(\xi_1(M), -\xi_n(M))$,
with $\xi_\ell(M)$ the $\ell$-th largest eigenvalue of $M$.)

Random regular graphs provide the simplest model on which this intuition can be made precise.
Denoting by $G^{\mathrm{reg}}(n,d)$ the uniform distribution over graphs with $n$ vertices and uniform degree $d$, we
have, for $G \sim G^{\mathrm{reg}}(n,d)$, $\mathbb{E}A_G \approx (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}$, whence $\|\mathbb{E}A_G\|_2 \approx d$. On the other hand, the fact that
random regular graphs are 'almost Ramanujan' [Fri03] implies $\|A_G - \mathbb{E}A_G\|_{\mathrm{op}} \le 2\sqrt{d-1} + o_n(1) \ll d$.
Roughly speaking, the random part $A_G - \mathbb{E}A_G$ is smaller than the expectation by a factor $2/\sqrt{d}$.

The situation is not as clean-cut for random graphs with irregular degrees. To be definite,
consider the Erdős-Rényi random graph distribution $G(n,d/n)$ whereby each edge is present
independently with probability $d/n$ (and hence the average degree is roughly $d$). Also in this case
$\mathbb{E}A_G \approx (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}$, whence $\|\mathbb{E}A_G\|_{\mathrm{op}} \approx d$. However, the largest eigenvalue of $A_G - \mathbb{E}A_G$ is of the
order of the square root of the maximum degree, namely $\sqrt{\log n/(\log\log n)}$ [KS03]. Summarizing


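\[
\|A_G - \mathbb{E}A_G\|_{\mathrm{op}} =
\begin{cases}
2\sqrt{d}\,(1 + o(1)) & \text{if } G \sim G^{\mathrm{reg}}(n,d),\\
\sqrt{\log n/(\log\log n)}\,(1 + o(1)) & \text{if } G \sim G(n,d/n).
\end{cases}
\tag{1}
\]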


Further, for $G \sim G(n,d/n)$, the leading eigenvectors of $A_G - \mathbb{E}A_G$ are concentrated near
high-degree vertices, and carry virtually no information about the global structure of $G$. In particular,
they cannot be used for clustering.

Far from being a mathematical curiosity, this difference has far-reaching consequences: spectral
algorithms are known to fail, or to be vastly suboptimal, for random graphs with bounded average
degree [FO05, CO10, KMO10, DKMZ11, KMM+13]. The community detection problem (a.k.a.
'planted partition') is an example of this failure that has attracted significant attention recently. Let
$G(n,a/n,b/n)$ be the distribution over graphs with $n$ vertices defined as follows. The vertex set
is partitioned uniformly at random into two subsets $S_1$, $S_2$ with $|S_i| = n/2$. Conditional on this
partition, edges are independent with


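\[
\mathbb{P}\big((i,j) \in E \mid S_1, S_2\big) =
\begin{cases}
a/n & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2,\\
b/n & \text{if } i \in S_1,\ j \in S_2 \text{ or } i \in S_2,\ j \in S_1.
\end{cases}
\tag{2}
\]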
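As an illustration of the planted partition model of Eq. (2), the following is a minimal sampling sketch in Python with NumPy. The function name `sample_planted_partition` and the ±1 label encoding are illustrative assumptions, not part of the paper; setting `a == b == d` gives an Erdős-Rényi graph $G(n,d/n)$.

```python
import numpy as np

def sample_planted_partition(n, a, b, rng):
    """Sample an adjacency matrix from G(n, a/n, b/n).

    Returns (A, labels), where labels[i] in {+1, -1} encodes membership
    in S_1 or S_2 (a hypothetical encoding chosen for this sketch).
    """
    if n % 2 != 0:
        raise ValueError("n must be even so that |S_1| = |S_2| = n/2")
    # Uniformly random balanced partition of the vertex set.
    labels = np.ones(n, dtype=int)
    labels[rng.permutation(n)[: n // 2]] = -1
    same_group = np.equal.outer(labels, labels)
    # Edge probability a/n within a group, b/n across groups.
    probs = np.where(same_group, a / n, b / n)
    # Draw each unordered pair {i, j} with i < j once, then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(int)
    return A, labels
```

The strict upper triangle is sampled once and mirrored, so the result is a symmetric 0/1 matrix with zero diagonal.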


Given a single realization of such a graph, we would like to detect, and identify the partition.
Early work on this problem showed that simple spectral methods are successful when $a = a(n)$,
$b = b(n) \to \infty$ sufficiently fast. However Eq. (1) -and its analogue for the model $G(n,a/n,b/n)$-
implies that this approach fails unless $(a-b)^2 \ge C \log n/\log\log n$. (Throughout $C$ indicates
numerical constants.)
Several ideas have been developed to overcome this difficulty. The simplest one is to
remove from $G$ all vertices whose degree is, say, more than ten times larger than the average degree
$d$. Feige and Ofek [FO05] showed that, if this procedure is applied to $G \sim G(n,d/n)$, it yields a new
graph $G'$ that has roughly the same number of vertices as $G$, but $\|A_{G'} - \mathbb{E}\{A_{G'}\}\|_{\mathrm{op}} \le C\sqrt{d}$, with
high probability. The same trimming procedure was successfully applied in [KMO10] to matrix
completion, and in [CO10, CRV15] to community detection. This approach has however several
drawbacks. First, the specific threshold for trimming is somewhat arbitrary and relies on the
idea that degrees should concentrate around their average: this is not necessarily true in actual
applications. Second, it discards a subset of the data. Finally, it is only optimal 'up to constants.'
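The trimming step described above can be sketched as follows. This is a minimal illustration, not the procedure of [FO05] verbatim: the function name `trim_high_degree` is hypothetical, and the sketch uses the empirical mean degree in place of the model parameter $d$, which is an assumption.

```python
import numpy as np

def trim_high_degree(A, factor=10.0):
    """Drop vertices whose degree exceeds factor * (empirical average degree).

    Returns the trimmed adjacency matrix and the indices of kept vertices.
    """
    A = np.asarray(A)
    degrees = A.sum(axis=1)
    keep = degrees <= factor * degrees.mean()
    # Restrict to the induced subgraph on the kept vertices.
    return A[np.ix_(keep, keep)], np.flatnonzero(keep)
```

The default `factor=10.0` mirrors the 'ten times the average degree' rule of thumb in the text; as noted there, this threshold is somewhat arbitrary.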

A new set of spectral methods to overcome the same problem was proposed and analyzed
within the community detection problem [DKMZ11, KMM+13, MNS13, Mas14, BLM15, ?]. These
methods construct a new matrix that replaces the adjacency matrix $A_G$, and then compute its
leading eigenvalues/eigenvectors. We refer to Section 2 for further discussion. These approaches
are extremely interesting and mathematically sophisticated. In particular, some of them have been
proved to have an optimal detection threshold under the model $G(n,a/n,b/n)$ [MNS13, Mas14,
BLM15]. Unfortunately they rely on delicate properties of the underlying probabilistic model. For
instance, they are not robust to an adversarial addition of $o(n)$ edges (see Section 4.2).

1.2 Main results (I): Erdős-Rényi and regular random graphs

Semidefinite programming (SDP) relaxations provide a different approach towards overcoming the
limitations of spectral algorithms. We denote the cone of $n \times n$ symmetric positive semidefinite
matrices by $\mathrm{PSD}(n) = \{X \in \mathbb{R}^{n\times n} : X \succeq 0\}$. The convex set of positive-semidefinite matrices with
diagonal entries equal to one is denoted by

\[
\mathrm{PSD}_1(n) = \{X \in \mathbb{R}^{n\times n} : X \succeq 0,\ X_{ii} = 1\ \forall i \in [n]\}. \tag{3}
\]

The set $\mathrm{PSD}_1(n)$ is also known as the elliptope. Given a matrix $M$, we define$^1$

\[
\mathrm{SDP}(M) = \max\{\langle M, X\rangle : X \in \mathrm{PSD}_1(n)\}. \tag{4}
\]

It is well known that approximate information about the extremal cuts of $G$ can be obtained by
computing $\mathrm{SDP}(A_G)$ [GW95].

The main result of this paper is that the above SDP is also nearly optimal in extracting
information about sparse random graphs. In particular, it eliminates the irregularities due to high-degree
vertices, cf. Eq. (1). Our first result characterizes the value of $\mathrm{SDP}(A_G - \mathbb{E}\{A_G\})$ for $G$ an
Erdős-Rényi random graph with large bounded degree.$^2$ (Its proof is given in Appendix A.)

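Theorem 1. Let $G \sim G(n,d/n)$ be an Erdős-Rényi random graph with edge probability $d/n$, $A_G$
its adjacency matrix, and $A_G^{\mathrm{cen}} = A_G - \mathbb{E}\{A_G\}$ its centered adjacency matrix. Then there exists
$C = C(d)$ such that with probability at least $1 - C e^{-n/C}$, we have

\[
\frac{1}{n}\mathrm{SDP}(A_G^{\mathrm{cen}}) = 2\sqrt{d} + o_d(\sqrt{d}), \qquad
\frac{1}{n}\mathrm{SDP}(-A_G^{\mathrm{cen}}) = 2\sqrt{d} + o_d(\sqrt{d}). \tag{5}
\]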

$^1$Here and below $\langle A, B\rangle = \mathrm{Tr}(A^{\mathsf T} B)$ is the usual scalar product between matrices.

$^2$Throughout the paper, $O(\cdot)$, $o(\cdot)$, and $\Theta(\cdot)$ refer to the usual $n \to \infty$ asymptotic, while $O_d(\cdot)$, $o_d(\cdot)$ and
$\Theta_d(\cdot)$ are used to describe the $d \to \infty$ asymptotic regime. We say that a sequence of events $B_n$ occurs with high
probability (w.h.p.) if $\mathbb{P}(B_n) \to 1$ as $n \to \infty$. Finally, for random $\{X_n\}$ and non-random $f : \mathbb{R}_{>0} \to \mathbb{R}_{>0}$, we
say that $X_n = o_d(f(d))$ w.h.p. as $n \to \infty$ if there exists non-random $g(d) = o_d(f(d))$ such that the sequence
$B_n = \{|X_n| \le g(d)\}$ occurs w.h.p. (as $n \to \infty$).
Note that $\mathrm{SDP}(A_G^{\mathrm{cen}}) \le n\,\xi_1(A_G^{\mathrm{cen}})$ (here and in the following $\xi_1(M) \ge \xi_2(M) \ge \dots \ge \xi_n(M)$
denote the eigenvalues of the symmetric matrix $M$). However, while $\xi_1(A_G^{\mathrm{cen}})$ is sensitive to vertices
of atypically large degree, cf. Eq. (1), $\mathrm{SDP}(A_G^{\mathrm{cen}})$ appears to be sensitive only to the average degree.
Intuitively, the constraint $X_{ii} = 1$ rules out the highly localized eigenvectors that are responsible
for $\xi_1(A_G^{\mathrm{cen}}) \approx \sqrt{\log n/\log\log n}$.
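For completeness, the spectral upper bound on the SDP value follows in one line from the constraints defining the elliptope. The derivation below is a standard argument written in the notation of this section, with $\xi_1(M)$ the largest eigenvalue of a symmetric matrix $M$ and $X$ any feasible point:

```latex
\langle M, X\rangle
  = \langle M - \xi_1(M) I,\, X\rangle + \xi_1(M)\,\mathrm{Tr}(X)
  \le \xi_1(M)\,\mathrm{Tr}(X)
  = n\,\xi_1(M),
\qquad X \succeq 0,\; X_{ii} = 1 .
```

The inequality uses that $M - \xi_1(M) I \preceq 0$ and $X \succeq 0$ have non-positive inner product; the last step uses $\mathrm{Tr}(X) = \sum_i X_{ii} = n$.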

Another way of interpreting Theorem 1 is that Erdős-Rényi random graphs behave, with respect
to SDP, as random regular graphs with the same average degree. Indeed, we have the following
more precise result for regular graphs. (See Appendix B for the proof.)

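Theorem 2. Let $G \sim G^{\mathrm{reg}}(n,d)$ be a random regular graph with degree $d$, and $A_G^{\mathrm{cen}} = A_G - \mathbb{E}\{A_G\}$
its centered adjacency matrix. Then, with high probability

\[
\frac{1}{n}\mathrm{SDP}(A_G^{\mathrm{cen}}) = 2\sqrt{d-1} + o_n(1), \qquad
\frac{1}{n}\mathrm{SDP}(-A_G^{\mathrm{cen}}) = 2\sqrt{d-1} + o_n(1). \tag{6}
\]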

Remark 1.1. The quantity $\mathrm{SDP}(A_G^{\mathrm{cen}})$ can also be thought of as a relaxation of the problem of
maximizing $\sum_{i,j=1}^n A_{ij}\sigma_i\sigma_j$ over $\sigma_i \in \{+1,-1\}$, $\sum_{i=1}^n \sigma_i = 0$. The result of our companion paper
[DMS15] implies that this has -with high probability- value $2nP_*\sqrt{d} + n\,o_d(\sqrt{d})$ (see [DMS15] for
a definition of $P_*$). We deduce that -with high probability- the SDP relaxation overestimates the
optimum by a factor $1/P_* + o_d(1)$ (where $1/P_* \approx 1.310$).

Remark 1.2. For the sake of simplicity, we stated Eq. (5) in asymptotic form. However, our proof
provides quantitative bounds on the error terms. In particular, the $o_d(\sqrt{d})$ term is upper bounded
by log(d), for C a numerical constant. 


1.3 Main results (II): Hidden partition problem 

We next apply the SDP defined in Eq. (4) to the community detection problem. To be definite we
will formalize this as a binary hypothesis testing problem, whereby we want to determine -with high 
probability of success- whether the random graph under consideration has a community structure 
or not. The estimation version of the problem, i.e. the question of determining -approximately- a 
partition into communities, can be addressed by similar techniques. 

We are given a single graph $G = (V, E)$ over $n$ vertices and we have to decide which of the
following holds:

Hypothesis 0: $G \sim G(n,d/n)$ is an Erdős-Rényi random graph with edge probability $d/n$, $d =
(a+b)/2$. We denote the corresponding distribution over graphs by $\mathbb{P}_0$.


Hypothesis 1: $G \sim G(n,a/n,b/n)$ is a random graph with a planted partition and edge probabilities
$a/n$, $b/n$. We denote the corresponding distribution over graphs by $\mathbb{P}_1$.


A statistical test takes as input a graph $G$, and returns $T(G) \in \{0,1\}$ depending on which hypothesis
is estimated to hold. We say that it is successful with high probability if $\mathbb{P}_0(T(G) = 1) + \mathbb{P}_1(T(G) =
0) \to 0$ as $n \to \infty$.

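Theorem 1 indicates that, under Hypothesis 0, we have $\mathrm{SDP}(A_G - (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}) = 2n\sqrt{d} + n\,o_d(\sqrt{d})$.
This suggests the following test:

\[
T(G;\delta) =
\begin{cases}
1 & \text{if } \mathrm{SDP}(A_G - (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}) \ge 2n(1+\delta)\sqrt{d},\\
0 & \text{otherwise.}
\end{cases}
\tag{7}
\]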
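Once the SDP value has been computed by any solver, the decision rule of Eq. (7) is a single comparison. A minimal sketch, reading the threshold as $2n(1+\delta)\sqrt{d}$; the function name `sdp_test` and its argument names are illustrative assumptions, and computing `sdp_value` itself is outside the scope of this snippet:

```python
def sdp_test(sdp_value, n, d, delta):
    """Return 1 (community structure) or 0 (Erdos-Renyi), as in Eq. (7).

    sdp_value is assumed to be SDP(A_G - (d/n) 11^T), obtained elsewhere.
    """
    threshold = 2.0 * n * (1.0 + delta) * d ** 0.5
    return 1 if sdp_value >= threshold else 0
```

For example, with `n = 100`, `d = 4.0` and `delta = 0.1` the threshold is close to 440, so a value of 500 is classified as Hypothesis 1 and a value of 400 as Hypothesis 0.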
Mossel, Neeman, Sly [MNS12] proved that no test can be successful with high probability if $(a -
b) < \sqrt{2(a+b)}$. Polynomially computable tests that achieve this threshold were developed in
[MNS13, Mas14, BLM15] using advanced spectral methods. As mentioned, these approaches can
be fragile to perturbations of the precise probabilistic model, cf. Section 4.2.
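The detection threshold appears in this paper in two equivalent forms, one in terms of $(a-b)^2/(4d)$ and one in terms of $(a-b)/\sqrt{2(a+b)}$. The short computation below, which uses only $d = (a+b)/2$ and assumes $a > b$, records why they agree:

```latex
\frac{(a-b)^2}{4d} = \frac{(a-b)^2}{2(a+b)} > 1
\;\Longleftrightarrow\;
(a-b)^2 > 2(a+b)
\;\Longleftrightarrow\;
\frac{a-b}{\sqrt{2(a+b)}} > 1 .
```

As a numerical illustration (values chosen only for this example), $a = 7$ and $b = 1$ give $d = 4$ and $(a-b)^2/(4d) = 36/16 = 2.25 > 1$, which lies above the threshold.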

Our next result addresses the fundamental question: Does the SDP-based test achieve the
information-theoretic threshold? Notice that the recent work of [GV14] falls short of answering this
question since it requires the vastly sub-optimal condition $(a-b)^2 \ge 10^4 (a+b)$. (We refer to
Appendix A for its proof.)

Theorem 3. Assume, for some $\varepsilon > 0$,

\[
\frac{a-b}{\sqrt{2(a+b)}} \ge 1 + \varepsilon. \tag{8}
\]

Then there exists $\delta_* = \delta_*(\varepsilon) > 0$ and $d_* = d_*(\varepsilon) > 0$ such that the following holds. If $d = (a+b)/2 \ge
d_*$, then the SDP-based test $T(\,\cdot\,;\delta_*)$ succeeds with high probability.

Further, the error probability is at most $C e^{-n/C}$ for $C = C(a,b)$ a constant.


Remark 1.3. This theorem guarantees that SDP is nearly optimal for large but bounded degree $d$.
By comparison, the naive spectral test that returns $T_{\mathrm{spec}}(G) = 1$ if $\xi_1(A_G) \ge \theta_*$ and $T_{\mathrm{spec}}(G) = 0$
otherwise (for any threshold value $\theta_*$) is sub-optimal by an unbounded factor for $d = O(1)$.

Remark 1.4. One might wonder why we consider large degree asymptotics $d = (a+b)/2 \to \infty$
instead of trying to establish a threshold at $(a-b)/\sqrt{2(a+b)} = 1$ for fixed $a$, $b$. Preliminary
non-rigorous calculations [JMRT15] suggest that indeed this is necessary. For fixed $(a+b)$ the SDP
threshold does not coincide with the optimal one.


Remark 1.5. For the sake of simplicity, we formulated the community detection problem as a
hypothesis testing problem. A related (somewhat more challenging) task is to estimate the hidden
partition better than by random guessing. In Section 4.1 we will show that, under the same
conditions of Theorem 3, we can assign vertices making at most $(1-\Delta)n/2$ mistakes (with high
probability for some $\Delta$ bounded away from 0).

We will discuss related work in the next section, then provide an outline of the proof ideas
in Section 3, and finally discuss extensions of the above results in Section 4. Detailed proofs are
deferred to the appendix.


1.4 Notations 

Given $n \in \mathbb{N}$, we let $[n] = \{1,2,\dots,n\}$ denote the set of first $n$ integers. We write $|S|$ for the
cardinality of a set $S$. We will use lowercase boldface (e.g. $v = (v_1,\dots,v_n)$, $x = (x_1,\dots,x_n)$,
etc.) for vectors and uppercase boldface (e.g. $A = (A_{i,j})_{i,j\in[n]}$, $Y = (Y_{i,j})_{i,j\in[n]}$, etc.) for matrices.
Given a symmetric matrix $M$, we let $\xi_1(M) \ge \xi_2(M) \ge \dots \ge \xi_n(M)$ be its ordered eigenvalues
(with $\xi_{\max}(M) = \xi_1(M)$, $\xi_{\min}(M) = \xi_n(M)$). In particular $\mathbf{1}_n = (1,1,\dots,1) \in \mathbb{R}^n$ is the all-ones
vector, $I_n$ the identity matrix, and $e_i \in \mathbb{R}^n$ is the $i$-th standard unit vector.

For $v \in \mathbb{R}^n$, $\|v\|_p = \big(\sum_{i} |v_i|^p\big)^{1/p}$ denotes its $\ell_p$ norm (extended in the standard way to
$p = \infty$). For a matrix $M$, we denote by $\|M\|_{p\to q} = \sup_{v \ne 0} \|Mv\|_q/\|v\|_p$ its $\ell_p$-to-$\ell_q$ operator
norm, with the standard shorthands $\|M\|_{\mathrm{op}} \equiv \|M\|_2 \equiv \|M\|_{2\to 2}$.
Throughout, with high probability means 'with probability converging to one as $n \to \infty$.' We
follow the standard Big-Oh notation for asymptotics. We will be interested in bounding error terms
with respect to $n$ and $d$. Whenever not clear from the context, we indicate in subscript the variable
that is large. For instance $f(n,d) = o_d(1)$ means that there exists a function $g(d) \ge 0$ independent
of $n$ such that $\lim_{d\to\infty} g(d) = 0$ and $|f(n,d)| \le g(d)$. (Hence $f(n,d) = \cos(0.1n)/d = o_d(1)$ but
$f(n,d) = \log(n)/d \ne o_d(1)$.)

A random graph has a law (distribution), which is a probability distribution over graphs with
the same vertex set $V = [n]$. Since we are interested in the $n \to \infty$ asymptotics, it will be implicitly
understood that one such distribution is specified for each $n$.

We will use C (or Cq, Ci,...) to denote constants, that will change from point to point. Unless 
otherwise stated, these are universal constants. 

2 Further related literature 

Few results have been proved about the behavior of classical SDP relaxations on sparse random 
graphs and -to the best of our knowledge- none of these earlier results is tight. 

A significant amount of work has been devoted to analyzing SDP hierarchies on random CSP
instances [Gri01, Sch08], and -more recently- on (semi-)random Unique Games instances [KMM11].
These papers typically prove only one-sided bounds that are not claimed to be sharp as the number
of variables diverges.

Coja-Oghlan [CO03] studies the value of the Lovász theta function $\vartheta(G)$, for $G \sim G(n,p)$ a dense
Erdős-Rényi random graph, establishing $C_1\sqrt{n/p} \le \vartheta(G) \le C_2\sqrt{n/p}$ with high probability. As in
the previous cases, this result is not tight.

Ambainis et al. [ABB+12] study an SDP similar to (4), for $M$ a dense random matrix with
i.i.d. entries. One of their main results is analogous to a special case of our Theorem 5 (a) below
-namely, to the case $\lambda = 0$. (We prefer to give an independent -simpler- proof also of this case.)

Several papers have been devoted to SDP approaches for community detection and the related
'synchronization' problem. A partial list includes [BCSZ14, ABH14, HWX14, HWX15, ABC+15].
These papers focus on finding sufficient conditions under which the SDP recovers exactly the
unknown signal. For instance, in the context of the hidden partition model (2), this requires diverging
degrees $a, b = \Theta(\log n)$ [ABH14, HWX14, HWX15]. SDP was proved in [HWX14] to achieve the
information-theoretically optimal threshold for exact reconstruction. The techniques to prove this
type of result are very different from the ones employed here: since the (conjectured) optimum is
known explicitly, it is sufficient to certify it through a dual witness.

The only result on community detection that compares to ours was recently proven by Guédon
and Vershynin [GV14]. Their work uses the classical Grothendieck inequality to establish upper
bounds on the estimation error of SDP. The resulting bound applies only under the condition
$(a-b)^2 \ge 10^4 (a+b)$. This condition is vastly sub-optimal with respect to the information-theoretic
threshold $(a-b)^2 > 2(a+b)$ established in [MNS12, MNS13, Mas14] (and is unlikely to be satisfied
by realistic graphs). In particular, the results of [GV14] leave open the central question: is SDP to
be discarded in favor of the spectral methods of [MNS13, Mas14], or is the sub-optimality just an
outcome of the analysis?

In this paper we provide evidence indicating that SDP is in fact nearly optimal for community
detection. While we also make use of a Grothendieck inequality as in [GV14], this is only one step
(and not the most challenging) in a significantly longer argument. Let us emphasize that the gap
between the ideal threshold at $(a-b)/\sqrt{2(a+b)} = 1$, and the guarantees of [GV14] cannot be
filled simply by carrying out more carefully the same proof strategy. In order to fill the gap we need
to develop several new ideas: (i) A new (higher rank) Grothendieck inequality; (ii) A smoothing
of the original graph parameter $\mathrm{SDP}(\,\cdot\,)$; (iii) An interpolation argument; (iv) A sharp analysis of
SDP for Gaussian random matrices.


3 Proof strategy 

Throughout, we denote by $A_G^{\mathrm{cen}} = A_G - (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}$ the centered adjacency matrix of $G \sim G(n,d/n)$
or $G \sim G(n,a/n,b/n)$. Our proofs of Theorem 1 and Theorem 3 follow a similar strategy that can
be summarized as follows:


Step 1: Smooth. We replace the function $M \mapsto \mathrm{SDP}(M)$ by a smooth function $M \mapsto \Phi(\beta,k;M)$
that depends on two additional parameters $\beta \in \mathbb{R}_{\ge 0}$ and $k \in \mathbb{N}$. We prove that, for $\beta$, $k$ large
(and $M$ sufficiently 'regular'), $|\mathrm{SDP}(M) - \Phi(\beta,k;M)|$ can be made arbitrarily small,
uniformly in the matrix dimensions. This in particular requires developing a new (higher rank)
Grothendieck-type inequality, which is of independent interest, see Section 3.1.

Step 2: Interpolate. We use an interpolation method (analogous to the Lindeberg method) to
compare the value $\Phi(\beta,k;A_G^{\mathrm{cen}})$ to $\Phi(\beta,k;B)$, where $B \in \mathbb{R}^{n\times n}$ is a symmetric Gaussian
matrix with independent entries. More precisely, we use $B_{ij} \sim \mathsf{N}(0,1/n)$ to approximate
$G \sim G(n,d/n)$ and $B_{ij} \sim \mathsf{N}(\lambda/n, 1/n)$ to approximate the hidden partition model $G \sim
G(n,a/n,b/n)$, with $\lambda = (a-b)/\sqrt{2(a+b)}$. Further detail is provided in Section 3.2.

Note that the interpolation/Lindeberg method requires $M \mapsto \Phi(\beta,k;M)$ to be differentiable,
which is the reason for Step 1 above.


Step 3: Analyze. We finally carry out an analysis of $\mathrm{SDP}(B)$ with $B$ distributed according to the
above Gaussian models. In doing this we can take advantage of the high degree of symmetry
of Gaussian random matrices. This part of the proof is relatively simple for Theorem 1, but
becomes challenging in the case of Theorem 3, see Section 3.3.

(The proof of Theorem 2 is more direct and will be presented in Appendix B.) In the next subsections
we will provide further details about each of these steps. The formal proofs of Theorem 1 and
Theorem 3 are presented in Appendix A, with technical lemmas in other appendices.

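The construction of the smooth function $\Phi(\beta,k;M)$ is inspired by statistical mechanics. As
an intermediate step, define the following rank-constrained version of the SDP (4)

\[
\mathrm{OPT}_k(M) = \max\{\langle M, X\rangle : X \in \mathrm{PSD}_1(n),\ \mathrm{rank}(X) \le k\} \tag{9}
\]
\[
\phantom{\mathrm{OPT}_k(M)} = \max\Big\{\sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle : \sigma_i \in S^{k-1}\Big\}, \tag{10}
\]

where $S^{k-1} = \{\sigma \in \mathbb{R}^k : \|\sigma\|_2 = 1\}$ is the unit sphere in $k$ dimensions. We then define $\Phi(\beta,k;M)$
as the following log-partition function

\[
\Phi(\beta,k;M) = \frac{1}{\beta} \log\Big\{ \int \exp\Big\{ \beta \sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle \Big\}\, \mathrm{d}\nu(\sigma) \Big\}. \tag{11}
\]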
Here $\sigma = (\sigma_1, \sigma_2, \dots, \sigma_n) \in (S^{k-1})^n$ and we denote by $\mathrm{d}\nu(\,\cdot\,)$ the uniform measure on $(S^{k-1})^n$
(normalized to 1, i.e. $\int \mathrm{d}\nu(\sigma) = 1$).

It is easy to see that $\lim_{\beta\to\infty} \Phi(\beta,k;M) = \mathrm{OPT}_k(M)$, and $\mathrm{OPT}_n(M) = \mathrm{SDP}(M)$. For carrying
out the above proof strategy we need to bound the errors $|\Phi(\beta,k;M) - \mathrm{OPT}_k(M)|$ and $|\mathrm{OPT}_k(M) -
\mathrm{SDP}(M)|$ uniformly in $n$.
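To make the rank-constrained problem $\mathrm{OPT}_k(M)$ concrete, the sketch below runs a simple coordinate ascent over unit vectors $\sigma_i \in S^{k-1}$. This is a generic heuristic chosen for illustration, not a method used in the paper; the function name `rank_k_ascent` is hypothetical. For symmetric $M$ each update maximizes the objective in $\sigma_i$ with the other vectors fixed, so the returned value is a lower bound on $\mathrm{OPT}_k(M)$, and hence on $\mathrm{SDP}(M)$, but it may stop at a local maximum.

```python
import numpy as np

def rank_k_ascent(M, k, rng, sweeps=200):
    """Heuristic lower bound on OPT_k(M) for a symmetric matrix M.

    Maximizes sum_{i,j} M_ij <sigma_i, sigma_j> over unit vectors
    sigma_i in R^k by cyclic coordinate ascent.
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    # Diagonal terms contribute the constant trace(M), so drop them here.
    off = M - np.diag(np.diag(M))
    sigma = rng.standard_normal((n, k))
    sigma /= np.linalg.norm(sigma, axis=1, keepdims=True)
    for _ in range(sweeps):
        for i in range(n):
            field = off[i] @ sigma  # sum over j != i of M_ij * sigma_j
            norm = np.linalg.norm(field)
            if norm > 0:
                sigma[i] = field / norm  # best unit vector given the others
    X = sigma @ sigma.T  # feasible point: PSD, unit diagonal, rank <= k
    return float(np.sum(M * X)), sigma
```

Because `X` has unit diagonal and is positive semidefinite, the returned value never exceeds $n\,\xi_1(M)$, which gives a quick sanity check on any run.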


3.1 Higher-rank Grothendieck inequalities and zero-temperature limit 

In order to bound the error $|\mathrm{OPT}_k(M) - \mathrm{SDP}(M)|$ we develop a new Grothendieck-type inequality
which is of independent interest.

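Theorem 4. For $k \ge 1$, let $g \sim \mathsf{N}(0, I_k/k)$ be a vector with i.i.d. centered normal entries with
variance $1/k$, and define $\alpha_k \equiv (\mathbb{E}\|g\|_2)^2$.

Then, for any symmetric matrix $M \in \mathbb{R}^{n\times n}$, we have the inequalities

\[
\mathrm{SDP}(M) \ge \mathrm{OPT}_k(M) \ge \alpha_k\,\mathrm{SDP}(M) - (1-\alpha_k)\,\mathrm{SDP}(-M), \tag{12}
\]
\[
\mathrm{OPT}_k(M) \ge (2 - \alpha_k^{-1})\,\mathrm{SDP}(M) - (\alpha_k^{-1} - 1)\,\mathrm{OPT}_k(-M). \tag{13}
\]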


Remark 3.1. The upper bound in Eq. (12) is trivial. Further, it follows from Cauchy-Schwarz
that $\alpha_k \in (0,1)$ for all $k$. Also $k\|g\|_2^2$ is a chi-squared random variable with $k$ degrees of freedom
and hence

\[
\alpha_k = \frac{2\,\Gamma((k+1)/2)^2}{k\,\Gamma(k/2)^2} = 1 - \frac{1}{2k} + O(1/k^2). \tag{14}
\]

Substituting in Eq. (12) we get, for all $k \ge k_0$ with $k_0$ a sufficiently large constant, and assuming
$\mathrm{SDP}(M) > 0$,

\[
\Big(1 - \frac{1}{k}\Big)\mathrm{SDP}(M) - \frac{1}{k}\,|\mathrm{SDP}(-M)| \le \mathrm{OPT}_k(M) \le \mathrm{SDP}(M). \tag{15}
\]

In particular, if $|\mathrm{SDP}(-M)|$ is of the same order as $\mathrm{SDP}(M)$, we conclude that $\mathrm{OPT}_k(M)$
approximates $\mathrm{SDP}(M)$ with a relative error of order $O(1/k)$.
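The constant $\alpha_k$ of Eq. (14) is easy to evaluate numerically. The sketch below, with the hypothetical helper name `alpha_k`, uses only the Python standard library and works with log-gamma values to avoid overflow for large $k$:

```python
import math

def alpha_k(k):
    """Evaluate alpha_k = 2 * Gamma((k+1)/2)^2 / (k * Gamma(k/2)^2)."""
    log_ratio = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    return 2.0 * math.exp(2.0 * log_ratio) / k
```

For $k = 1$ the formula gives $\alpha_1 = 2/\pi$, and for $k = 2$ it gives $\pi/4$; for large $k$ the gap $1 - \alpha_k$ behaves like $1/(2k)$, which is what makes the relative error in Eq. (15) of order $1/k$.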


The classical Grothendieck inequality concerns non-symmetric bilinear forms [Gro96]. A Grothendieck
inequality for symmetric matrices was established in [NRT99, Meg01] (see also [AMMN06] for
generalizations) and states that, for a constant $C$,

\[
\mathrm{OPT}_1(M) \ge \frac{1}{C \log n}\,\mathrm{SDP}(M). \tag{16}
\]

Higher-rank Grothendieck inequalities were developed in the setting of general graphs in [Bri10,
BdOFV10]. However, constant-factor approximations were not established for the present problem
(which corresponds to the complete graph case in [Bri10]).

Constant-factor approximations exist for $M$ positive semidefinite [BdOFV10]. We note that
Theorem 4 implies the inequality of [BdOFV10]. Using $\mathrm{SDP}(-M) \le -n\,\xi_{\min}(M)$ in Eq. (12),
we obtain the inequality of [BdOFV10] for the positive semidefinite matrix $M - \xi_{\min}(M) I$. On
the other hand, the result of [BdOFV10] is too weak for our applications. We want to apply
Theorem 4 -among others- to $M = A_G^{\mathrm{cen}}$ with $A_G^{\mathrm{cen}}$ the centered adjacency matrix of $G \sim G(n,d/n)$.
This matrix is not positive semidefinite, and dramatically so, with smallest eigenvalue satisfying
$-\xi_{\min}(A_G^{\mathrm{cen}}) \approx (\log n/(\log\log n))^{1/2} \gg \mathrm{SDP}(-A_G^{\mathrm{cen}})/n$.

In summary, we could not use the vast literature on Grothendieck-type inequalities to prove our
main result, Theorem 1, which motivated us to develop Theorem 4.

Theorem 4 will allow us to bound $|\mathrm{SDP}(M) - \mathrm{OPT}_k(M)|$ for $M$ either a centered adjacency matrix
or a Gaussian matrix. The next lemma bounds the 'smoothing error' $|\Phi(\beta,k;M) - \mathrm{OPT}_k(M)|$.

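Lemma 3.2. There exists an absolute constant $C$ such that for any $\varepsilon \in (0,1]$ the following holds.
If $\|M\|_{\infty\to 2} \equiv \max\{\|Mx\|_2 : \|x\|_\infty \le 1\} \le L\sqrt{n}$, then

\[
\Big| \frac{1}{n}\Phi(\beta,k;M) - \frac{1}{n}\mathrm{OPT}_k(M) \Big| \le 2L\varepsilon\sqrt{k} + \frac{k}{\beta}\log\frac{C}{\varepsilon}. \tag{17}
\]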


3.2 Interpolation 

Our next step consists in comparing the adjacency matrix of the random graph $G$ with a suitable
Gaussian random matrix, and bounding the error in the corresponding log-partition function $\Phi(\beta,k;\,\cdot\,)$.

Let us recall the definition of the Gaussian orthogonal ensemble $\mathrm{GOE}(n)$. We have $W \sim \mathrm{GOE}(n)$
if $W \in \mathbb{R}^{n\times n}$ is symmetric with $\{W_{ij}\}_{i\le j}$ independent, with distribution $W_{ii} \sim \mathsf{N}(0,2/n)$ and
$W_{ij} \sim \mathsf{N}(0,1/n)$ for $i < j$. We then define, for $\lambda \ge 0$, the following deformed GOE matrix:

\[
B(\lambda) \equiv \frac{\lambda}{n}\mathbf{1}\mathbf{1}^{\mathsf T} + W, \tag{18}
\]

where $W \sim \mathrm{GOE}(n)$. The argument $\lambda$ will be omitted if clear from the context. The next lemma
establishes the necessary comparison bound. Note that we state it for $G \sim G(n,a/n,b/n)$ a random
graph from the hidden partition model, but it obviously applies to standard Erdős-Rényi random
graphs by setting $a = b = d$.

Lemma 3.3. Let $A_G^{\mathrm{cen}} = A_G - (d/n)\mathbf{1}\mathbf{1}^{\mathsf T}$ be the centered adjacency matrix of $G \sim G(n,a/n,b/n)$,
whereby $d = (a+b)/2$. Define $\lambda \equiv (a-b)/(2\sqrt{d})$. Then there exists an absolute constant $n_0$ such
that, if $n \ge \max(n_0, (15d)^2)$,

^E^{P,k-,A^(r/Vd) 


re 


2/32 8AI/2 


(19) 


Note that this lemma bounds the difference in expectation. We will use concentration of measure 
to transfer this result to a bound holding with high probability. 

Interpolation (or 'smart path') methods have a long history in probability theory, dating back to
Lindeberg's beautiful proof of the central limit theorem [Lin22]. Since our smoothing construction
yields a log-partition function $\Phi(\beta,k;M)$, our calculations are similar to certain proofs in
statistical mechanics. A short list of statistical-mechanics inspired results in probabilistic combinatorics
includes [FL03, FLT03, BGT13, PT04, GT04]. In our companion paper [DMS15], we used a
similar approach to characterize the limit value of the minimum bisection of Erdős-Rényi and random
regular graphs.

















3.3 SDPs for Gaussian random matrices 


The last part of our proof analyzes the Gaussian model (18). Random matrices of this type have
attracted a significant amount of work within statistics (under the name of 'spiked model') and
probability theory (as 'deformed Wigner -or GOE- matrices'), aimed at characterizing their
eigenvalues and eigenvectors. A very incomplete list of references includes [BBAP05, FP07, CDMF+11,
BGGM12, BV13, PRS13, KY13]. A key phenomenon unveiled by these works is the so-called
Baik-Ben Arous-Péché (or BBAP) phase transition. In its simplest form (and applied to the matrix of
Eq. (18)) this predicts a phase transition in the largest eigenvalue of $B(\lambda)$


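\[
\lim_{n\to\infty} \xi_1(B(\lambda)) =
\begin{cases}
2 & \text{if } \lambda \le 1,\\
\lambda + \lambda^{-1} & \text{if } \lambda > 1.
\end{cases}
\tag{20}
\]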


(This limit can be interpreted as holding in probability.) Here, we establish an analogue of this 
result for the SDP value. 
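The eigenvalue transition of Eq. (20) is easy to observe numerically at moderate $n$. The sketch below samples the deformed GOE matrix $B(\lambda) = (\lambda/n)\mathbf{1}\mathbf{1}^{\mathsf T} + W$ and returns the predicted limit; the function names `sample_deformed_goe` and `bbap_limit` are illustrative assumptions, and finite-$n$ fluctuations around the limit are expected.

```python
import numpy as np

def sample_deformed_goe(n, lam, rng):
    """Sample B(lam) = (lam/n) * 11^T + W with W ~ GOE(n)."""
    G = rng.standard_normal((n, n))
    # Off-diagonal entries get variance 1/n, diagonal entries variance 2/n.
    W = (G + G.T) / np.sqrt(2.0 * n)
    return (lam / n) * np.ones((n, n)) + W

def bbap_limit(lam):
    """Limit of the largest eigenvalue of B(lam) predicted by Eq. (20)."""
    return 2.0 if lam <= 1.0 else lam + 1.0 / lam
```

A typical use is to compare `np.linalg.eigvalsh(sample_deformed_goe(n, lam, rng))[-1]` with `bbap_limit(lam)` for values of `lam` on both sides of 1.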


Theorem 5 (SDP phase transition for deformed GOE matrices). Let $B = B(\lambda) \in \mathbb{R}^{n\times n}$ be a
symmetric matrix distributed according to the model (18). Namely $B = B^{\mathsf T}$ with $\{B_{ij}\}_{i\le j}$ independent
random variables, where $B_{ij} \sim \mathsf{N}(\lambda/n, 1/n)$ for $1 \le i < j \le n$ and $B_{ii} \sim \mathsf{N}(\lambda/n, 2/n)$ for $1 \le i \le n$.
Then

(a) If $\lambda \in [0,1]$, then for any $\varepsilon > 0$, we have $\mathrm{SDP}(B(\lambda))/n \in [2-\varepsilon, 2+\varepsilon]$ with probability
converging to one as $n \to \infty$.

(b) If $\lambda > 1$, then there exists $\Delta(\lambda) > 0$ such that $\mathrm{SDP}(B(\lambda))/n \ge 2 + \Delta(\lambda)$ with probability
converging to one as $n \to \infty$.

As mentioned above, we obviously have $\mathrm{SDP}(B)/n \le \xi_1(B)$. The first part of this theorem
(in conjunction with Eq. (20)) establishes that the upper bound is essentially tight for $\lambda \le 1$.
On the other hand, we expect the eigenvalue upper bound not to be tight for $\lambda > 1$ [JMRT15].
Nevertheless, the second part of our theorem establishes a phase transition taking place at $\lambda = 1$
as for the leading eigenvalue.

Remark 3.4. The phase transition in the leading eigenvalue has a high degree of universality. In
particular, Eq. (20) remains correct if the model (18) is replaced by $B' = \lambda v v^{\mathsf T} + W$, with $v$ an
arbitrary unit vector. On the other hand, we expect the phase transition in $\mathrm{SDP}(B')/n$ to depend
-in general- on the vector $v$, and in particular on how 'spiky' this is.


4 Other results and generalizations 

While our analysis was focused on a relatively simple model, the techniques presented here allow for several
generalizations. We discuss them briefly here. 

4.1 Estimation 

For the sake of simplicity, we formulated community detection as a hypothesis testing problem. It
is interesting to consider the associated estimation problem, which requires estimating the hidden
partition $V = S_1 \cup S_2$.
We encode the ground truth using the vector $x_0 \in \{+1,-1\}^n$, with $x_{0,i} = +1$ if $i \in S_1$ and
$x_{0,i} = -1$ if $i \in S_2$. An estimator is a map$^3$ $\hat{x} : \mathcal{G}_n \to \{+1, 0, -1\}^n$ with $\mathcal{G}_n$ the space of graphs over
$n$ vertices. It is proved in [MNS12] that no estimator is substantially better than random guessing
for $G \sim G(n,a/n,b/n)$, with $\lambda = (a-b)/\sqrt{2(a+b)} < 1$. More precisely, for $\lambda < 1$, any estimator
achieves vanishing correlation with the ground truth: $|\langle \hat{x}(G), x_0\rangle| = o(n)$ with high probability.

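We construct a randomized SDP-based estimator $\hat{x}^{\mathrm{SDP}}$ as follows (we will denote expectation
and probability with respect to the algorithm's randomness by $\mathbb{E}_{\mathrm{alg}}(\,\cdot\,)$ and $\mathbb{P}_{\mathrm{alg}}(\,\cdot\,)$).

(i) Partition the edge set $E = E_1 \cup E_2$ by letting $(i,j) \in E_2$ independently for each edge $(i,j) \in E$,
with probability $\mathbb{P}_{\mathrm{alg}}((i,j) \in E_2) = \delta_n/(1+\delta_n)$, and $(i,j) \in E_1$ otherwise. Denote
by $G_1 = (V, E_1)$ and $G_2 = (V, E_2)$ the resulting graphs.

(ii) Compute an optimizer $X_*$ of the SDP (4), with $M = A_{G_1}^{\mathrm{cen}}$ (i.e. a matrix $X_* \in \mathrm{PSD}_1(n)$ such that
$\langle A_{G_1}^{\mathrm{cen}}, X_*\rangle = \mathrm{SDP}(A_{G_1}^{\mathrm{cen}})$).

(iii) Compute the eigenvalue decomposition $X_* = \sum_{i=1}^n \xi_i v_i v_i^{\mathsf T}$ and let $v_i = (v_{i,1}, v_{i,2}, \dots, v_{i,n})$
denote the $i$-th eigenvector. For each $i, j \in [n]$ define $\hat{x}^{(i,j)} \in \{+1, 0, -1\}^n$ by $\hat{x}^{(i,j)}_\ell = \mathrm{sign}(v_{i,\ell})$
if $|v_{i,\ell}| \ge |v_{i,j}|$ and $\hat{x}^{(i,j)}_\ell = 0$ otherwise. (In words, $\hat{x}^{(i,j)}$ is obtained from $v_i$ by zeroing
entries with magnitude below $|v_{i,j}|$ and taking the sign of those above.)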

(iv) Select (/, J) = arg maXjjgj„](S(*’'^), and return ®®°^(G) = 

The next result implies that, for large bounded average degree $d$, this estimator has a nearly
optimal threshold.

Theorem 6. Let $G \sim G(n,a/n,b/n)$ and assume, for some $\varepsilon > 0$, $\lambda = (a-b)/\sqrt{2(a+b)} \ge 1 + \varepsilon$.
Then there exists $\Delta_{\mathrm{est}} = \Delta_{\mathrm{est}}(\varepsilon) > 0$ and $d_* = d_*(\varepsilon) > 0$ such that, for all $d \ge d_*(\varepsilon)$

p(^i|(£^^^(G),a^o)| > Agg,(e)) (21) 

with $\mathbb{P}(\,\cdot\,)$ denoting probability with respect to the algorithm and the graph $G$, and $C = C(\varepsilon)$ a
constant.

4.2 Robustness 

Consider the problem of testing whether the graph $G$ has a community structure, i.e. whether
$G \sim G(n,a/n,b/n)$ or $G \sim G(n,d/n)$, $d = (a+b)/2$. The next result establishes that the
SDP-based test of Section 1.3 is robust with respect to adversarial perturbations of these models. Namely,
an adversary can arbitrarily modify $o(n)$ edges of these graphs, without changing the detection
threshold.

Corollary 4.1. Let $\mathbb{P}_0$ be the law of $G \sim G(n,d/n)$, and $\mathbb{P}_1$ be the law of $G \sim G(n,a/n,b/n)$. Denote
by $\tilde{\mathbb{P}}_0$, $\tilde{\mathbb{P}}_1$ any two distributions over graphs with vertex set $V = [n]$. Assume that, for each
$a \in \{0,1\}$, the following happens: there exists a coupling $\mathbb{Q}_a$ of $\mathbb{P}_a$ and $\tilde{\mathbb{P}}_a$ such that, if $(G, \tilde{G}) \sim \mathbb{Q}_a$,
then $|E(G) \,\triangle\, E(\tilde{G})| = o(n)$ with high probability.

Then, under the same assumptions of Theorem 3, the SDP-based test distinguishes $\tilde{\mathbb{P}}_0$ from
$\tilde{\mathbb{P}}_1$ with error probability vanishing as $n \to \infty$.

$^3$Earlier work sometimes assumes $\hat{x} : \mathcal{G}_n \to \{+1,-1\}^n$, i.e. forbids the estimate 0. For our purposes, the two
formulations are equivalent: we can always 'simulate' $\hat{x}_i = 0$ by letting $\hat{x}_i \in \{+1,-1\}$ uniformly at random.
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By comparison, spectral methods such as the one of [BLMlSj appear to be fragile to an adver¬ 
sarial perturbation of o{n) edges [.TMRTl,^ . 

4.3 Multiple communities 

The hidden partition model of Eq. (2) can be naturally generalized to the case of $r \ge 2$ hidden communities. Namely, we define the distribution $G_r(n, a/n, b/n)$ over graphs as follows. The vertex set $[n]$ is partitioned uniformly at random into $r$ subsets $S_1, S_2, \dots, S_r$ with $|S_i| = n/r$. Conditional on this partition, edges are independent with

$$\mathbb{P}\big((i,j) \in E \,\big|\, \{S_\ell\}_{\ell \le r}\big) = \begin{cases} a/n & \text{if } \{i,j\} \subseteq S_\ell \text{ for some } \ell \in [r], \\ b/n & \text{otherwise.} \end{cases} \tag{22}$$

The resulting graph has average degree $d = [a + (r-1)b]/r$. The case studied above (hidden bisection) is recovered by setting $r = 2$ in this definition: $G(n, a/n, b/n) = G_2(n, a/n, b/n)$. Of course, this model can be generalized further by allowing for $r$ unequal subsets, and a generic $r \times r$ matrix of edge probabilities [HLL83, AS15, HWX15].
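A minimal sampler for $G_r(n, a/n, b/n)$ may help fix ideas. It is a sketch under stated assumptions: $n$ is taken divisible by $r$, and the function names and the edge-list representation are illustrative choices, not part of the model definition.

```python
import random


def sample_partition(n, r, rng):
    """Uniformly random labeling of range(n) with r labels, each used n // r times."""
    labels = [v % r for v in range(n)]
    rng.shuffle(labels)
    return labels


def sample_graph(n, r, a, b, rng):
    """Draw (labels, edges): edge probability a/n within a block, b/n across blocks."""
    labels = sample_partition(n, r, rng)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = a / n if labels[i] == labels[j] else b / n
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


def average_degree(a, b, r):
    """The average degree d = [a + (r - 1) b] / r."""
    return (a + (r - 1) * b) / r
```

The double loop costs $O(n^2)$ time, which is fine for small experiments but not for large sparse graphs.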

Given a single realization of the graph $G$, we would like to test whether $G \sim G(n, d/n)$ (hypothesis 0), or $G \sim G_r(n, a/n, b/n)$ (hypothesis 1). We use the same SDP relaxation already introduced in Eq. (4), and the test $T(\,\cdot\,; \delta)$ defined in Eq. (7). This is particularly appealing because it does not require knowledge of the number of communities $r$.

Theorem 7. Consider the problem of distinguishing $G \sim G_r(n, a/n, b/n)$ from $G \sim G(n, d/n)$, $d = (a + (r-1)b)/r$. Assume, for some $\varepsilon > 0$,

$$\frac{a - b}{\sqrt{r\,(a + (r-1)b)}} \ge 1 + \varepsilon. \tag{23}$$

Then there exist $\delta_* = \delta_*(\varepsilon, r) > 0$ and $d_* = d_*(\varepsilon, r) > 0$ such that the following holds. If $d \ge d_*$, then the SDP-based test $T(\,\cdot\,; \delta_*)$ succeeds with error probability at most $C e^{-n/C}$ for $C = C(a, b, r)$ a constant.
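As a quick numerical aid, the left-hand side of condition (23) can be evaluated directly. The sketch below assumes the condition reads $(a-b)/\sqrt{r(a+(r-1)b)} \ge 1+\varepsilon$, which for $r = 2$ reduces to $\lambda = (a-b)/\sqrt{2(a+b)}$; the function names are hypothetical.

```python
import math


def snr(a, b, r=2):
    """Left-hand side of condition (23): (a - b) / sqrt(r * (a + (r - 1) * b))."""
    return (a - b) / math.sqrt(r * (a + (r - 1) * b))


def above_threshold(a, b, r=2, eps=0.0):
    """True when the pair (a, b) satisfies condition (23) with margin eps."""
    return snr(a, b, r) >= 1.0 + eps
```

For instance, with $a = 5$ and $b = 3$ the ratio equals $2/\sqrt{16} = 0.5$, so the hypothesis of the theorem is not met.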

Remark 4.2. In earlier work, a somewhat tighter relaxation is sometimes used, including the additional constraint $X_{ij} \ge -(r-1)^{-1}$ for all $i \ne j$. The simpler relaxation used here is however sufficient for proving Theorem 7.

Remark 4.3. The threshold established in Theorem 7 coincides (for large degrees) with the one of spectral methods using non-backtracking random walks [BLM15]. However, for $k \ge 4$ there appears to be a gap between general statistical tests and what is achieved by polynomial time algorithms

[DKMZIlllCTrd] . 
A Proofs of Theorem 1 and Theorem 3 (main theorems)

In this Section we prove Theorem 1 and Theorem 3 using Theorems 4, 5 and Lemmas 3.2, 3.3. The proofs of the latter are presented in Appendices C, D, E, F, G.

We begin by proving a general approximation result, and then obtain Theorem 1 and Theorem 3 as consequences.


A.1 Three technical lemmas


Lemma A.1. Let $G \sim G(n, a/n, b/n)$, $d = (a + b)/2$, and $A^{\rm cen}_G = A_G - (d/n)\mathbf{1}\mathbf{1}^{\sf T}$ be its centered adjacency matrix. For $\lambda \in \mathbb{R}$ fixed, define $B = B(\lambda)$ to be the deformed GOE matrix in Eq. (18). Then, there exists a universal constant $C$ such that, for either $M \in \{A^{\rm cen}_G/\sqrt{d}, B(\lambda)\}$, for all $t \ge 0$,


P{ $(/3,fc;M) -E$(/3,A:;M) > nt} < 

(24) 

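Proof. Define the following Gibbs probability measure over $(S^{k-1})^n$, which is naturally associated to the free energy $\Phi$:

$$\mu_M({\rm d}\sigma) = \frac{\exp(\beta H_M(\sigma))}{\int \exp(\beta H_M(\tau))\,{\rm d}\nu(\tau)}\,{\rm d}\nu(\sigma), \tag{25}$$

$$H_M(\sigma) = \langle \sigma, M\sigma\rangle = \sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle. \tag{26}$$

It is a straightforward exercise with moment generating functions to show that

$$\frac{\partial \Phi}{\partial M_{ij}}(\beta, k; M) = \mu_M(\langle \sigma_i, \sigma_j\rangle), \tag{27}$$

where $\mu_M(f(\sigma))$ denotes the expectation of $f(\sigma)$ with respect to the probability measure $\mu_M$. In particular, since $|\langle \sigma_i, \sigma_j\rangle| \le 1$ (here $\|\cdot\|_2$ denotes the vector $\ell_2$ norm)

$$\|\nabla_M \Phi(\beta, k; M)\|_2^2 = \sum_{i,j=1}^n \Big(\frac{\partial \Phi}{\partial M_{ij}}\Big)^2 \le n^2. \tag{28}$$

This implies Eq. (24) for $M = B$ by Gaussian isoperimetry (with constant $C = 4$).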

For $M = A^{\rm cen}_G/\sqrt{d}$ the proof is analogous. Let $G$ be a graph that does not contain edge $(i,j)$, and $G^+$ denote the same graph, to which edge $(i,j)$ has been added. Then writing the definition of $\Phi(\,\cdots)$, we get

$$\Phi(\beta, k; A^{\rm cen}_{G^+}/\sqrt{d}) - \Phi(\beta, k; A^{\rm cen}_G/\sqrt{d}) = \frac{1}{\beta}\log\Big\{\mu_{A^{\rm cen}_G/\sqrt{d}}\big(e^{2\beta\langle \sigma_i, \sigma_j\rangle/\sqrt{d}}\big)\Big\}. \tag{29}$$

In particular

$$\big|\Phi(\beta, k; A^{\rm cen}_{G^+}/\sqrt{d}) - \Phi(\beta, k; A^{\rm cen}_G/\sqrt{d})\big| \le \frac{2}{\sqrt{d}}. \tag{30}$$

The claim then follows from a standard application of the 'method of bounded differences' [BLM13], i.e. from the Azuma-Hoeffding inequality, whereby we construct a bounded differences martingale with a number of steps equal to a sufficiently large constant times the number of edges, e.g. $10dn$. □

Lemma A.2. Let $G \sim G(n, a/n, b/n)$, $d = (a + b)/2$, and $A^{\rm cen}_G = A_G - (d/n)\mathbf{1}\mathbf{1}^{\sf T}$ be its centered adjacency matrix. Then there exists a universal constant $C$ such that, for any $t \ge 0$


'||SDP(Ag") -ESDP(Ag")| > nij < 


(31) 


Proof. Let $G$ be a graph that does not contain edge $(i,j)$, and $G^+$ denote the same graph, to which edge $(i,j)$ has been added. Let $X \in {\rm PSD}_1(n)$ be an optimizer of the SDP with data $A^{\rm cen}_G$, i.e. a feasible point such that $\langle A^{\rm cen}_G, X\rangle = {\rm SDP}(A^{\rm cen}_G)$. Then

$${\rm SDP}(A^{\rm cen}_{G^+}) \ge \langle A^{\rm cen}_{G^+}, X\rangle \tag{32}$$
$$= \langle A^{\rm cen}_G, X\rangle + X_{ij} \tag{33}$$
$$\ge {\rm SDP}(A^{\rm cen}_G) - 1, \tag{34}$$

where we used the fact that $X$ is positive semidefinite to obtain $|X_{ij}| \le \sqrt{X_{ii}X_{jj}} = 1$. Exchanging the role of $G$ and $G^+$, we obtain

$$\big|{\rm SDP}(A^{\rm cen}_{G^+}) - {\rm SDP}(A^{\rm cen}_G)\big| \le 1. \tag{35}$$

As in the previous lemma, the claim follows from an application of the 'method of bounded differences' [BLM13], i.e. from the Azuma-Hoeffding inequality (we can apply this to a martingale with a number of steps proportional to the expected number of edges, say $10dn$, whence the claimed probability bound follows). □

Lemma A.3. Let $A^{\rm cen}_G$, $B$ be defined as in Lemma A.1. Then, there exists an absolute constant $C > 0$ such that the following holds with probability at least $1 - C e^{-n/C}$:

$$\|A^{\rm cen}_G\|_{\infty\to 2} \le C d\sqrt{n}, \qquad \|B\|_{\infty\to 2} \le (C + \lambda)\sqrt{n}.$$

Proof. For $B$ we use (letting $\|M\|_{2\to 2} = \|M\|_{\rm op} = \max(\lambda_1(M), -\lambda_n(M))$):

$$\|B\|_{\infty\to 2} \le \sqrt{n}\,\|B\|_{2\to 2} \tag{36}$$
$$\le \sqrt{n}\,\big(\lambda + \|W\|_{2\to 2}\big) \tag{37}$$
$$\le (C + \lambda)\sqrt{n}, \tag{38}$$

where the last inequality holds with the desired probability by standard concentration bounds on the extremal eigenvalues of GOE matrices [AGZ09, Section 2.3].

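For $A^{\rm cen}_G$, first note that

$$\|A^{\rm cen}_G\|_{\infty\to 2} \le \|A_G\|_{\infty\to 2} + \frac{d}{n}\|\mathbf{1}\mathbf{1}^{\sf T}\|_{\infty\to 2} \le \|A_G\|_{\infty\to 2} + \frac{d}{\sqrt{n}}\|\mathbf{1}\mathbf{1}^{\sf T}\|_{2\to 2} \tag{39}$$
$$\le \|A_G\|_{\infty\to 2} + d\sqrt{n}. \tag{40}$$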


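Next we observe that $\sigma \mapsto \|A_G\sigma\|_2^2$ is a convex function on $\|\sigma\|_\infty \le 1$, and thus attains its maximum at one of the corners of the hypercube $[-1, 1]^n$. In other words, $\|A_G\|_{\infty\to 2}^2 = \max_{\sigma \in \{+1,-1\}^n}\|A_G\sigma\|_2^2$.

For $\sigma \in \{+1, -1\}^n$, we get

$$\|A_G\sigma\|_2^2 = \sum_{i=1}^n\Big(\sum_{j=1}^n (A_G)_{ij}\sigma_j\Big)^2 \le \sum_{i=1}^n \deg_G(i)^2, \tag{41}$$

where $\deg_G(i)$ is the degree of vertex $i$ in $G$. The desired bound follows since $\sum_{i=1}^n \deg_G(i)^2 \le C_0^2 d^2 n$ with the desired probability for some constant $C_0$ large enough (see, e.g. [JLR00]). □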
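The corner argument is easy to check by brute force on a tiny graph. The following sketch (hypothetical helper name, exhaustive over all $2^n$ sign vectors, so only usable for very small $n$) compares $\max_\sigma \|A_G\sigma\|_2^2$ with $\sum_i \deg_G(i)^2$; for a 0/1 adjacency matrix the bound is attained at the all-ones vector.

```python
from itertools import product


def check_inf_to_2_bound(n, edges):
    """Return (max over sign vectors of ||A sigma||_2^2, sum of squared degrees)."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    deg_sq_sum = sum(sum(row) ** 2 for row in adj)
    worst = 0
    # Exhaustive search over the corners of the hypercube [-1, 1]^n.
    for sigma in product((-1, 1), repeat=n):
        norm_sq = sum(sum(a * s for a, s in zip(row, sigma)) ** 2 for row in adj)
        worst = max(worst, norm_sq)
    return worst, deg_sq_sum
```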
A.2 A general approximation result

Theorem 8. Let $G \sim G(n, a/n, b/n)$, $d = (a + b)/2$, and $A^{\rm cen}_G = A_G - (d/n)\mathbf{1}\mathbf{1}^{\sf T}$ be its centered adjacency matrix. Let $\lambda = (a - b)/\sqrt{2(a + b)}$ and define $B = B(\lambda)$ to be the deformed GOE matrix in Eq. (18). Then, there exists $C = C(\lambda)$ such that, with probability at least $1 - C e^{-n/C}$, for all $n \ge n_0(a, b)$

$$\Big|\frac{1}{n\sqrt{d}}{\rm SDP}(A^{\rm cen}_G) - \frac{1}{n}{\rm SDP}(B(\lambda))\Big| \le \frac{C\log d}{d^{1/10}}, \tag{42}$$

$$\Big|\frac{1}{n\sqrt{d}}{\rm SDP}(-A^{\rm cen}_G) - \frac{1}{n}{\rm SDP}(-B(\lambda))\Big| \le \frac{C\log d}{d^{1/10}}. \tag{43}$$

Further $C(\lambda)$ is bounded over compact intervals $\lambda \in [0, \lambda_{\max}]$.

Proof. Throughout the proof $C = C(\lambda)$ is a constant that depends only on $\lambda$, bounded as in the statement, and we will write 'for $n$ large enough' whenever a statement holds for $n \ge n_0(a, b)$. First notice that by Lemma 3.3 and Lemma A.1 we have, with probability larger than $1 - C e^{-n/C}$ and all $n$ large enough,


A‘sys/d) 


i<f((/3,t;B(A)) 


4/32 10 A ^/2 

< 4- 

Vd 


(44) 


Next, by Lemma 3.2 and Lemma A.3, with the same probability, for $M \in \{A^{\rm cen}_G/\sqrt{d}, B(\lambda)\}$, and $\beta, d > 1$


i$(ft;=;M)-i0PU(M)|<^l„g(^«4±L) 


(45) 


(where we optimized the bound of Lemma 3.2 over $\varepsilon$.) Using the triangle inequality with Eq. (44), and optimizing over $\beta$, we get, always with probability at least $1 - C e^{-n/C}$,


^OPTfc(Ag“) - -OPJkiBiX)) 
ny/d n 


(^ + ^) 


Proceeding in the same way (with $\beta$ replaced by $-\beta$), we also obtain


^OPTfc(-AS’’) - -OPTfc(-S(A)) 
ny/d n 




(46) 


(47) 


Since $|{\rm OPT}_k(B)|, |{\rm OPT}_k(-B)| \le n\|B\|_{\rm op} \le Cn$ with probability at least $1 - C e^{-n/C}$, we get also


\op 

max <! -OPTfc(±S),^OPTfc(±Ag“) \<G, 
n ny/d 


whence, using Theorem 4, we obtain

1 


1 


SDP(Ag") - OPTfc(Ag- ^ 

ny/d ny/d 

isDP(S)-i0PTfc(S: 

n n 


cen \ ^ 

- 1’ 


< 


C 

c 


(48) 

(49) 

(50) 


The claim (42) follows from using this, together with Eq. (46) and the triangle inequality. Equation (43) follows from exactly the same argument. □

A.3 Proof of Theorem 1

Applying Theorem 5 to $\lambda = 0$ (whence $B(\lambda) = W \sim {\rm GOE}(n)$), we get, with high probability,

$$\frac{1}{n}{\rm SDP}(W),\ \frac{1}{n}{\rm SDP}(-W) \in \big[2 - d^{-1},\, 2 + d^{-1}\big]. \tag{51}$$

(The claim for $-W$ follows because $-W \sim {\rm GOE}(n)$.) Using Theorem 8 applied to $a = b = d$ (whence $G \sim G(n, d/n)$), we have, with high probability

$$\frac{1}{n\sqrt{d}}{\rm SDP}(A^{\rm cen}_G),\ \frac{1}{n\sqrt{d}}{\rm SDP}(-A^{\rm cen}_G) \in \Big[2 - \frac{C\log d}{d^{1/10}},\ 2 + \frac{C\log d}{d^{1/10}}\Big]. \tag{52}$$

This implies that the desired claim (5) holds with high probability. By the concentration lemma A.2 (with $a = b = d$) it also holds with probability at least $1 - C(d)e^{-n/C(d)}$.


A.4 Proof of Theorem 3

Recall, throughout the proof, that $\lambda = (a - b)/\sqrt{2(a + b)} \ge 1 + \varepsilon$ and $d = (a + b)/2$. Further,
without loss of generality, we can assume A G [0, Amax] with A m ax > 1 fixed (e.g. Amax = 10^). 

Recall that $\mathbb{P}_0$ denotes the law of $G \sim G(n, d/n)$ and $\mathbb{P}_1$ the law of $G \sim G(n, a/n, b/n)$. We can control the probability of false positives (i.e. declaring that $G$ has a two-community structure when it does not) using Theorem 1. For any $\delta > 0$, we have

$$\lim_{n\to\infty}\mathbb{P}_0\big(T(G;\delta) = 1\big) = \lim_{n\to\infty}\mathbb{P}_0\Big(\frac{1}{n}{\rm SDP}(A^{\rm cen}_G) \ge 2(1+\delta)\sqrt{d}\Big) = 0, \tag{53}$$

where the last equality holds for any $d \ge d_0(\delta)$.

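We next bound the probability of false negatives. Let $\Delta(\,\cdot\,)$ be as per Theorem 5. By Theorem 8 there exists $d_0' = d_0'(\varepsilon)$ such that, for all $d \ge d_0'(\varepsilon)$, with high probability for $G \sim G(n, a/n, b/n)$,

$$\frac{1}{n\sqrt{d}}{\rm SDP}(A^{\rm cen}_G) \ge \frac{1}{n}{\rm SDP}(B(\lambda)) - \frac{1}{4}\Delta(1+\varepsilon) \tag{54}$$
$$\ge \frac{1}{n}{\rm SDP}(B(1+\varepsilon)) - \frac{1}{4}\Delta(1+\varepsilon) \tag{55}$$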

>2 + ^A(l + e), (56) 

where the second inequality follows because ${\rm SDP}(B(\lambda))$ is monotone non-decreasing in $\lambda$ and the last inequality follows from Theorem 5.

Selecting d*(e) = A (1 -|- e )/2 > 0, we then have 

lim Pi(r(G;d*(e)) = 0) = lim Pif^SDP(Ag“) < 2(l + d*(e))) (57) 

n—)-oo n^oo ^Tl\d ' 

= lim Pif^SDP(Ag“) < 2 + A*(l + e)) =0, (58) 

rn-oo \ny/d ' 

where the last equality follows from Eq. (56).

We proved therefore that the error probability vanishes as $n \to \infty$, provided $d \ge d_*(\varepsilon) = \max\big(d_0(\delta_*(\varepsilon)), d_0'(\varepsilon)\big)$. The same argument also implies (possibly after adjusting $\delta_*$)


lim Po( 

n—)-oo ' 

'^SDP(Ag“) 

'uYd 


= 0, 

(59) 

lim Pi 1 

n—^oo 

(^SDP(Ag“; 

+ 

VI 

= 0. 

(60) 


It then follows from the concentration lemma A.2 that these probabilities (and hence the error probability of our test) are bounded by $C e^{-n/C}$ for $C = C(a, b)$ a constant.

B Proof of Theorem 2 (SDP for random regular graphs)

Recall that ${\rm SDP}(M) \le n\,\xi_1(M)$. Further, the leading eigenvector of a $d$-regular graph is the all-ones vector $v_1 = \mathbf{1}/\sqrt{n}$. Using this remark together with the almost-Ramanujan property of random $d$-regular graphs [Fri03], we have, with high probability,

$$\frac{1}{n}{\rm SDP}(A^{\rm cen}_G) \le \xi_1(A^{\rm cen}_G) = \xi_2(A_G) = 2\sqrt{d-1} + o_n(1). \tag{61}$$

This gives us the required upper bound.

To derive a matching lower bound, we construct explicitly a feasible point of the optimization problem which asymptotically attains this value as $n \to \infty$.

To this end, let $T_d$ denote the infinite $d$-regular tree with vertex set $V(T_d)$. Csóka et al. [CGHV15, Theorems 3, 4] establish that for any $\lambda$ with $|\lambda| \le d$, there exists a centered Gaussian process indexed by the vertices of $T_d$, $\{Z_v : v \in V(T_d)\}$, such that with probability 1, for all $v \in V(T_d)$,

$$\sum_{u \in N(v)} Z_u = \lambda Z_v, \tag{62}$$

where $N(v)$ denotes the neighbors of $v \in V(T_d)$. These processes are referred to as "Gaussian wave functions". Further, Csóka et al. prove that for any $|\lambda| \le 2\sqrt{d-1}$, the process $\{Z_v : v \in V(T_d)\}$ can be approximated by linear factor of i.i.d. processes. More explicitly, let $\{Y_v : v \in V(T_d)\}$ be a collection of i.i.d. standard Gaussians $Y_v \sim {\sf N}(0,1)$; then there exists a sequence of coefficients $\{\alpha_\ell\}_{\ell \ge 0}$, $\alpha_\ell \in \mathbb{R}$, such that the Gaussian wave function $\{Z_v : v \in V(T_d)\}$ can be constructed so that

$$\lim_{L\to\infty}\mathbb{E}\big\{(Z_v - Z_v^{(L)})^2\big\} = 0, \tag{63}$$

$$Z_v^{(L)} = \sum_{\ell=0}^{L}\alpha_\ell\sum_{u \in V(T_d):\, d(u,v) = \ell} Y_u. \tag{64}$$

(Here d( •, •) is the usual graph distance.) 

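We use this construction with $\lambda = 2\sqrt{d-1} - \varepsilon$ for $\varepsilon$ a small positive number. Without loss of generality, we assume that ${\rm Var}(Z_v) = 1$ for all $v \in V(T_d)$. It is easy to see [CGHV15, Equation 2] that for $u, v \in V(T_d)$ such that $(u, v) \in E(T_d)$, we have

$$\mathbb{E}\{Z_u Z_v\} = \frac{\lambda}{d} = \frac{2\sqrt{d-1} - \varepsilon}{d}.$$

Thus, denoting by $\partial v$ the set of neighbors of vertex $v$, $\sum_{u \in \partial v}\mathbb{E}\{Z_u Z_v\} = 2\sqrt{d-1} - \varepsilon$. By Eq. (63) there exists $L = L(\varepsilon)$ large enough so that

$$\sum_{u \in \partial v}\frac{\mathbb{E}\{Z_u^{(L)} Z_v^{(L)}\}}{\sqrt{\mathbb{E}\{(Z_u^{(L)})^2\}\,\mathbb{E}\{(Z_v^{(L)})^2\}}} \ge 2\sqrt{d-1} - 2\varepsilon. \tag{65}$$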


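Let $G \sim G^{\rm reg}(n, d)$ be a random $d$-regular graph on $n$ vertices. We use the above construction to obtain a feasible point of the SDP, $X \in {\rm PSD}_1(n)$, with the desired value. Namely, let $\{Y_v : v \in V(G)\}$ be a collection of i.i.d. random variables $Y_v \sim {\sf N}(0,1)$, independent of the graph $G$. We define $\{Z_v^{(L)} : v \in V(G)\}$ using the same coefficients as above:

$$Z_v^{(L)} = \sum_{k=0}^{L}\alpha_k\sum_{u \in V(G):\, d(u,v) = k} Y_u. \tag{66}$$

We then construct the matrix $X = (X_{ij})_{i,j \in [n]}$ by letting

$$X_{ij} = \frac{\mathbb{E}\{Z_i^{(L)} Z_j^{(L)} \,|\, G\}}{\sqrt{\mathbb{E}\{(Z_i^{(L)})^2 \,|\, G\}\,\mathbb{E}\{(Z_j^{(L)})^2 \,|\, G\}}}. \tag{67}$$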


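It is immediate to see from the construction that $X \in {\rm PSD}_1(n)$ is a feasible point. At this feasible point,

$$\frac{1}{n}\langle A^{\rm cen}_G, X\rangle = \frac{1}{n}\sum_{v \in V(G)}\sum_{u \in \partial v}\frac{\mathbb{E}\{Z_u^{(L)} Z_v^{(L)} \,|\, G\}}{\sqrt{\mathbb{E}\{(Z_u^{(L)})^2 \,|\, G\}\,\mathbb{E}\{(Z_v^{(L)})^2 \,|\, G\}}} - \frac{d}{n^2}\sum_{i,j \in V(G)}\frac{\mathbb{E}\{Z_i^{(L)} Z_j^{(L)} \,|\, G\}}{\sqrt{\mathbb{E}\{(Z_i^{(L)})^2 \,|\, G\}\,\mathbb{E}\{(Z_j^{(L)})^2 \,|\, G\}}}. \tag{68}$$

Since $G$ converges almost surely as $n \to \infty$ to a $d$-regular tree (in the sense of local weak convergence, see, e.g. [nM+in]), and $Z_i^{(L)}$ is only a function of the $L$-neighborhood of $i$, we have, $G$-almost surely

$$\lim_{n\to\infty}\frac{1}{n}\sum_{v \in V(G)}\sum_{u \in \partial v}\frac{\mathbb{E}\{Z_u^{(L)} Z_v^{(L)} \,|\, G\}}{\sqrt{\mathbb{E}\{(Z_u^{(L)})^2 \,|\, G\}\,\mathbb{E}\{(Z_v^{(L)})^2 \,|\, G\}}} = \sum_{u \in \partial v}\frac{\mathbb{E}\{Z_u^{(L)} Z_v^{(L)}\}}{\sqrt{\mathbb{E}\{(Z_u^{(L)})^2\}\,\mathbb{E}\{(Z_v^{(L)})^2\}}} \ge 2\sqrt{d-1} - 2\varepsilon. \tag{69}$$

Also, since $\mathbb{E}\{Z_i^{(L)} Z_j^{(L)} \,|\, G\} = 0$ whenever $d(i,j) > 2L$, we have

$$\lim_{n\to\infty}\frac{1}{n^2}\sum_{i,j \in V(G)}\frac{\mathbb{E}\{Z_i^{(L)} Z_j^{(L)} \,|\, G\}}{\sqrt{\mathbb{E}\{(Z_i^{(L)})^2 \,|\, G\}\,\mathbb{E}\{(Z_j^{(L)})^2 \,|\, G\}}} = 0. \tag{70}$$

We conclude by noting that

$$\lim_{n\to\infty}\frac{1}{n}{\rm SDP}(A^{\rm cen}_G) \ge \lim_{n\to\infty}\frac{1}{n}\langle A^{\rm cen}_G, X\rangle \ge 2\sqrt{d-1} - 2\varepsilon, \tag{71}$$

and the thesis follows since $\varepsilon$ is arbitrary. The proof for $-A^{\rm cen}_G$ is exactly the same.

C Proof of Theorem 4 (Grothendieck-type inequality)

As mentioned already, the upper bound in Eq. (12) is trivial. The proof of the lower bound follows Rietz's method [Rie74].

Let $X$ be a solution of the problem (4) and through its Cholesky decomposition write $X_{ij} = \langle \sigma_i, \sigma_j\rangle$ with $\sigma_i \in \mathbb{R}^n$, $\|\sigma_i\|_2 = 1$. In other words we have, letting $M = (M_{ij})_{i,j \in [n]}$,

$${\rm SDP}(M) = \sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle. \tag{72}$$

Let $J \in \mathbb{R}^{k \times n}$ be a matrix with i.i.d. entries $J_{ij} \sim {\sf N}(0, 1/k)$. Define $x_i \in \mathbb{R}^k$, for $i \in [n]$, by letting

$$x_i = \frac{J\sigma_i}{\|J\sigma_i\|_2}. \tag{73}$$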

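We next need a technical lemma.

Lemma C.1. Let $u, v \in \mathbb{R}^n$ with $\|u\|_2 = \|v\|_2 = 1$ and $J$ be defined as above. Further, for $w \in \mathbb{R}^n$, let $z(w) = \big(1 - \alpha_k^{-1/2}\|Jw\|_2^{-1}\big)Jw$. Then

$$\mathbb{E}\Big\langle \frac{Ju}{\|Ju\|_2}, \frac{Jv}{\|Jv\|_2}\Big\rangle = \alpha_k\langle u, v\rangle + \alpha_k\,\mathbb{E}\langle z(u), z(v)\rangle. \tag{74}$$

Proof. Let $g_1, g_2 \sim {\sf N}(0, {\rm I}_k/k)$ be independent vectors (distributed as the first two columns of $J$). Let $a = \langle u, v\rangle$ and $b = \sqrt{1 - a^2}$. Then by rotation invariance

$$\mathbb{E}\langle Ju, Jv\rangle = \mathbb{E}\langle g_1, a g_1 + b g_2\rangle = a\,\mathbb{E}(\|g_1\|_2^2) = \langle u, v\rangle, \tag{75}$$

and

$$\mathbb{E}\Big\langle \frac{Ju}{\|Ju\|_2}, Jv\Big\rangle = \mathbb{E}\Big\langle \frac{g_1}{\|g_1\|_2}, a g_1 + b g_2\Big\rangle \tag{76}$$
$$= a\,\mathbb{E}(\|g_1\|_2) = \alpha_k^{1/2}\langle u, v\rangle. \tag{77}$$

By expanding the product we have

$$\mathbb{E}\langle z(u), z(v)\rangle = \langle u, v\rangle - \alpha_k^{-1/2}\,\mathbb{E}\Big\langle \frac{Ju}{\|Ju\|_2}, Jv\Big\rangle - \alpha_k^{-1/2}\,\mathbb{E}\Big\langle Ju, \frac{Jv}{\|Jv\|_2}\Big\rangle + \frac{1}{\alpha_k}\mathbb{E}\Big\langle \frac{Ju}{\|Ju\|_2}, \frac{Jv}{\|Jv\|_2}\Big\rangle \tag{78}$$
$$= -\langle u, v\rangle + \frac{1}{\alpha_k}\mathbb{E}\Big\langle \frac{Ju}{\|Ju\|_2}, \frac{Jv}{\|Jv\|_2}\Big\rangle, \tag{79}$$

which is equivalent to the statement of our lemma. □

Now, by definition of the $x_i$'s we have

$$\mathbb{E}\Big\{\sum_{i,j=1}^n M_{ij}\langle x_i, x_j\rangle\Big\} = \sum_{i,j=1}^n M_{ij}\,\mathbb{E}\Big\langle \frac{J\sigma_i}{\|J\sigma_i\|_2}, \frac{J\sigma_j}{\|J\sigma_j\|_2}\Big\rangle \tag{80}$$
$$= \alpha_k\sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle + \alpha_k\sum_{i,j=1}^n M_{ij}\,\mathbb{E}\langle z(\sigma_i), z(\sigma_j)\rangle \tag{81}$$
$$= \alpha_k\,{\rm SDP}(M) + \alpha_k\sum_{i,j=1}^n M_{ij}\,\mathbb{E}\langle z(\sigma_i), z(\sigma_j)\rangle. \tag{82}$$

Now we interpret $z(\sigma_i)$ as a vector in a Hilbert space with scalar product $\mathbb{E}\langle\,\cdot\,, \cdot\,\rangle$. Further, by the rounding lemma C.1 these vectors have norm

$$\mathbb{E}\big(\|z(\sigma_i)\|_2^2\big) = \frac{1}{\alpha_k} - 1. \tag{83}$$

Hence, by definition of ${\rm SDP}(\,\cdot\,)$, we have

$$-\sum_{i,j=1}^n M_{ij}\,\mathbb{E}\langle z(\sigma_i), z(\sigma_j)\rangle \le \Big(\frac{1}{\alpha_k} - 1\Big){\rm SDP}(-M). \tag{84}$$

Substituting this in Eq. (82), we obtain

$${\rm OPT}_k(M) \ge \mathbb{E}\Big\{\sum_{i,j=1}^n M_{ij}\langle x_i, x_j\rangle\Big\} \ge \alpha_k\,{\rm SDP}(M) - (1 - \alpha_k)\,{\rm SDP}(-M), \tag{85}$$

which coincides with the claim (12).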

To prove the remaining bound of Theorem 4, we apply Eq. (12) to $-M$, thus getting

$${\rm SDP}(-M) \le \frac{1}{\alpha_k}{\rm OPT}_k(-M) + \frac{1 - \alpha_k}{\alpha_k}{\rm SDP}(M). \tag{86}$$

Substituting this in Eq. (12), we obtain the claimed bound.
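For a feel of the constants, $\alpha_k$ can be evaluated numerically. The sketch below relies on the identity $\alpha_k^{1/2} = \mathbb{E}\|g_1\|_2$ for $g_1 \sim {\sf N}(0, {\rm I}_k/k)$ from the proof of Lemma C.1, together with the standard formula for the mean of a chi distribution, which gives $\alpha_k = (2/k)\,\big(\Gamma((k+1)/2)/\Gamma(k/2)\big)^2$. The function names are hypothetical.

```python
import math


def alpha(k):
    """alpha_k = (E||g||_2)^2 for g ~ N(0, I_k / k), via the mean of a chi_k variable."""
    log_ratio = math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
    return (2.0 / k) * math.exp(2.0 * log_ratio)


def lower_bound(k, sdp_plus, sdp_minus):
    """Right-hand side of (85): alpha_k * SDP(M) - (1 - alpha_k) * SDP(-M)."""
    a = alpha(k)
    return a * sdp_plus - (1.0 - a) * sdp_minus
```

In particular $\alpha_1 = 2/\pi$, and $\alpha_k$ increases to 1 as $k$ grows, so the gap between ${\rm OPT}_k$ and ${\rm SDP}$ closes for large rank $k$.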

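D Proof of Lemma 3.2 (zero-temperature approximation)

Define the objective function $H_M : (S^{k-1})^n \to \mathbb{R}$

$$H_M(\sigma) = \langle \sigma, M\sigma\rangle = \sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle. \tag{87}$$

(In the first expression $\langle\,\cdot\,, \cdot\,\rangle$ denotes the scalar product between matrices and we interpret $\sigma$ as a matrix $\sigma \in \mathbb{R}^{n \times k}$.) Let $\sigma^* \in \arg\max\{H_M(\sigma) : \sigma \in (S^{k-1})^n\}$. We then have (denoting by $\|\cdot\|_F$ the Frobenius norm):

$$|H_M(\sigma) - H_M(\sigma^*)| \le |\langle \sigma - \sigma^*, M\sigma\rangle| + |\langle \sigma - \sigma^*, M\sigma^*\rangle| \tag{88}$$
$$\le 2\max\{\|M\sigma\|_F, \|M\sigma^*\|_F\}\,\|\sigma - \sigma^*\|_F \tag{89}$$
$$\le 2\sqrt{k}\,\|M\|_{\infty\to 2}\,\|\sigma - \sigma^*\|_F. \tag{90}$$

Define the partition function

$$Z(\beta, k; M) = \int \exp\{\beta H_M(\sigma)\}\,{\rm d}\nu(\sigma), \tag{91}$$

so that, in particular, $\Phi(\beta, k; M) = (1/\beta)\log Z(\beta, k; M)$. By the above bound, and recalling $L \ge \|M\|_{\infty\to 2}/\sqrt{n}$,

$$Z(\beta, k; M) \ge e^{\beta\,{\rm OPT}_k(M)}\int \exp\big(-2\beta L\sqrt{nk}\,\|\sigma - \sigma^*\|_F\big)\,{\rm d}\nu(\sigma). \tag{92}$$

For any $\varepsilon > 0$, we have (here $\mathbb{I}(\,\cdot\,)$ denotes the indicator function)

$$\int \exp\big(-2\beta L\sqrt{nk}\,\|\sigma - \sigma^*\|_F\big)\,{\rm d}\nu(\sigma) \ge \int \exp\big(-2\beta L\sqrt{nk}\,\|\sigma - \sigma^*\|_F\big)\,\mathbb{I}\Big(\max_{i \in [n]}\|\sigma_i - \sigma_i^*\|_2 \le \varepsilon\Big)\,{\rm d}\nu(\sigma)$$
$$\ge \exp\big(-2\beta L n\varepsilon\sqrt{k}\big)\,\big(V_k(\varepsilon)\big)^n, \tag{93}$$

where $V_k(\varepsilon)$ is the volume of the spherical cap $\{\sigma_1 \in S^{k-1} : \|\sigma_1 - \sigma_1^*\|_2 \le \varepsilon\}$ (with respect to the normalized measure on the unit sphere $S^{k-1}$). By a simple integral in spherical coordinates we have $V_k(\varepsilon) = (1/2)\,\mathbb{P}\{X \le \varepsilon^2 - (\varepsilon^4/4)\}$ where $X \sim {\rm Beta}\big(\tfrac{k-1}{2}, \tfrac{1}{2}\big)$. Further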
X < £2 _ £_ ) > 


1 




4 y - 


V 


(94) 


Plugging this Eq. 


Beta(^, Jo 

I, we obtain (since OPTfc(iVf) = 

g/30PT,(M) > k; M) > ePOPT.m-mnsVk ( 95 ) 

Taking logarithms yields the desired bound (17).

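E Proof of Lemma 3.3 (interpolation)

Throughout this proof, we will fix, without loss of generality, $S_1 = \{1, \dots, n/2\}$ and $S_2 = \{(n/2) + 1, \dots, n\}$. Define $v \in \mathbb{R}^n$ by letting $v_i = 1/\sqrt{n}$ if $i \in S_1$ and $v_i = -1/\sqrt{n}$ if $i \in S_2$. Define

$$B^{\rm new}(\lambda) = \lambda v v^{\sf T} + W. \tag{96}$$

(We will drop the argument $\lambda$ when clear from the context.) By a change of variables in the definition of $\Phi(\beta, k;\,\cdot\,)$ (namely, $\sigma_i \to -\sigma_i$ for $i \in S_2$), and since $W_{ij} \stackrel{\rm d}{=} -W_{ij}$, we have

$$\mathbb{E}\Phi(\beta, k; B(\lambda)) = \mathbb{E}\Phi(\beta, k; B^{\rm new}(\lambda)). \tag{97}$$

We can and will therefore replace $B(\lambda)$ by $B^{\rm new}(\lambda)$. We will drop the superscript 'new.' We proceed in two steps, and define an intermediate Gaussian random matrix

$$D(\lambda) = \lambda v v^{\sf T} + U, \tag{98}$$

where $U = U^{\sf T} \in \mathbb{R}^{n \times n}$ is a Gaussian random matrix whose entries $(U_{ij})_{i<j}$ are independent zero-mean Gaussian random variables with

$${\rm Var}(U_{ij}) = \begin{cases} a[1 - a/n]/(nd) & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2, \\ b[1 - b/n]/(nd) & \text{if } i \in S_1, j \in S_2 \text{ or } i \in S_2, j \in S_1, \end{cases} \tag{99}$$

and $U_{ii} = 0$. By the triangle inequality

$$\Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; A^{\rm cen}_G/\sqrt{d}) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; B)\Big| \le \Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; A^{\rm cen}_G/\sqrt{d}) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; D)\Big| + \Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; D) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; B)\Big|. \tag{100}$$

The proof of Lemma 3.3 follows therefore from the next two results, which will be proved in the next subsections.

Lemma E.1. With the above definitions, if $n \ge (15d)^2$, then

$$\Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; A^{\rm cen}_G/\sqrt{d}) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; D)\Big| \le \frac{2\beta^2}{\sqrt{d}}. \tag{101}$$

Lemma E.2. With the above definitions, there exists an absolute constant $n_0$ such that, for all $n \ge n_0$,

$$\Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; D) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; B)\Big| \le 5\sqrt{\frac{a - b}{d}}. \tag{102}$$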

E.1 Proof of Lemma E.1

We use the following Lindeberg interpolation lemma, see e.g. |Tao12[[Ch^ . 

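Lemma E.3. Let $F : \mathbb{R}^N \to \mathbb{R}$ be three times continuously differentiable. Further, let $X = (X_1, \dots, X_N)$ and $Z = (Z_1, \dots, Z_N)$ be two vectors of independent random variables, satisfying $\mathbb{E}\{X_i\} = \mathbb{E}\{Z_i\}$, $\mathbb{E}\{X_i^2\} = \mathbb{E}\{Z_i^2\}$ for each $i \in \{1, \dots, N\}$. Then, we have

$$\big|\mathbb{E}\{F(X) - F(Z)\}\big| \le \frac{1}{6}\,S_3\,\max_{i \in [N]}\|\partial_i^3 F\|_\infty, \tag{103}$$

$$S_3 = \sum_{i=1}^N\big\{\mathbb{E}[|X_i|^3] + \mathbb{E}[|Z_i|^3]\big\}, \tag{104}$$

where $\partial_i^\ell F(x) = \frac{\partial^\ell F}{\partial x_i^\ell}(x)$ and $\|\partial_i^\ell F\|_\infty = \sup_{x \in \mathbb{R}^N}|\partial_i^\ell F(x)|$.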

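We apply this to the function $M \mapsto \Phi(\beta, k; M)$ with $N = n(n-1)/2$, to compare the two sets of independent random variables $D = (D_{ij})_{i<j}$ and $M = (M_{ij})_{i<j}$, where $M = A^{\rm cen}_G/\sqrt{d}$. It is immediate to check the equality of the first two moments. Indeed

$$\mathbb{E}\{D_{ij}\} = \mathbb{E}\{M_{ij}\} = \begin{cases} (a - b)/(2n\sqrt{d}) & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2, \\ -(a - b)/(2n\sqrt{d}) & \text{if } i \in S_1, j \in S_2 \text{ or } i \in S_2, j \in S_1, \end{cases} \tag{105}$$

and

$${\rm Var}(D_{ij}) = {\rm Var}(M_{ij}) = \begin{cases} a[1 - a/n]/(nd) & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2, \\ b[1 - b/n]/(nd) & \text{if } i \in S_1, j \in S_2 \text{ or } i \in S_2, j \in S_1. \end{cases} \tag{106}$$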
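The moment matching in (105) and (106) is a one-line Bernoulli computation, and can be double-checked numerically. In the sketch below (hypothetical function names), an entry of $A^{\rm cen}_G/\sqrt{d}$ is modeled as $({\rm Bernoulli}(p) - d/n)/\sqrt{d}$ with $p = a/n$ within a group and $p = b/n$ across groups.

```python
import math


def centered_entry_moments(p, d, n):
    """Mean and variance of M_ij = (Bernoulli(p) - d/n) / sqrt(d)."""
    mean = (p - d / n) / math.sqrt(d)
    var = p * (1.0 - p) / d
    return mean, var


def gaussian_entry_moments(a, b, n, same_group):
    """Mean and variance of D_ij as stated in (105) and (106)."""
    d = (a + b) / 2.0
    sign = 1.0 if same_group else -1.0
    mean = sign * (a - b) / (2.0 * n * math.sqrt(d))
    c = a if same_group else b
    var = c * (1.0 - c / n) / (n * d)
    return mean, var
```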


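Next we compute the partial derivatives of $M \mapsto \Phi(\beta, k; M)$. To this end, it is convenient to define the following Gibbs probability measure over $(S^{k-1})^n$, which is naturally associated to the free energy $\Phi$:

$$\mu_M({\rm d}\sigma) = \frac{\exp(\beta H_M(\sigma))}{\int \exp(\beta H_M(\tau))\,{\rm d}\nu(\tau)}\,{\rm d}\nu(\sigma), \tag{107}$$

where

$$H_M(\sigma) = \langle \sigma, M\sigma\rangle = \sum_{i,j=1}^n M_{ij}\langle \sigma_i, \sigma_j\rangle. \tag{108}$$

(The same construction was useful in Section A.1. We repeat it here for the reader's convenience.) It is then immediate to get (letting $\partial_{ij} = \frac{\partial}{\partial M_{ij}}$):

$$\partial_{ij}\Phi(\beta, k; M) = \mu_M(\langle \sigma_i, \sigma_j\rangle), \tag{109}$$

$$\partial_{ij}^2\Phi(\beta, k; M) = \beta\Big(\mu_M(\langle \sigma_i, \sigma_j\rangle^2) - \mu_M(\langle \sigma_i, \sigma_j\rangle)^2\Big), \tag{110}$$

$$\partial_{ij}^3\Phi(\beta, k; M) = \beta^2\Big(\mu_M(\langle \sigma_i, \sigma_j\rangle^3) - 3\mu_M(\langle \sigma_i, \sigma_j\rangle^2)\,\mu_M(\langle \sigma_i, \sigma_j\rangle) + 2\mu_M(\langle \sigma_i, \sigma_j\rangle)^3\Big), \tag{111}$$

where we used the convention of letting $\mu_M(f(\sigma))$ denote the expectation of $f(\sigma)$ with respect to the probability measure $\mu_M$. In particular, the above imply

$$\|\partial_{ij}^3\Phi\|_\infty \le 6\beta^2. \tag{112}$$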
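The bound (112) only uses that the third cumulant of a random variable supported in $[-1, 1]$ is at most $1 + 3 + 2 = 6$ in absolute value. The following sketch (hypothetical helper names, random finitely supported distributions) checks this numerically; it is an illustration of the inequality, not a proof.

```python
import random


def third_cumulant(values, probs):
    """m3 - 3 m2 m1 + 2 m1^3 for a finitely supported distribution."""
    m1 = sum(p * x for x, p in zip(values, probs))
    m2 = sum(p * x ** 2 for x, p in zip(values, probs))
    m3 = sum(p * x ** 3 for x, p in zip(values, probs))
    return m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3


def max_abs_third_cumulant(trials, support_size, rng):
    """Largest |third cumulant| seen over random distributions supported in [-1, 1]."""
    worst = 0.0
    for _ in range(trials):
        values = [rng.uniform(-1.0, 1.0) for _ in range(support_size)]
        weights = [rng.random() for _ in range(support_size)]
        total = sum(weights)
        probs = [w / total for w in weights]
        worst = max(worst, abs(third_cumulant(values, probs)))
    return worst
```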


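We are finally left with the task of bounding the sum of third moments defined in Eq. (104). Note that $M_{ij} = (1 - (d/n))/\sqrt{d}$ if $(i,j) \in E(G)$ and $M_{ij} = -\sqrt{d}/n$ otherwise. Hence, we have

$$\mathbb{E}\{|M_{ij}|^3\} \le \begin{cases} (a/n)\,d^{-3/2} + (\sqrt{d}/n)^3 & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2, \\ (b/n)\,d^{-3/2} + (\sqrt{d}/n)^3 & \text{if } i \in S_1, j \in S_2 \text{ or } i \in S_2, j \in S_1. \end{cases}$$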
Therefore 


S3= ^ E(|Dy|=‘)+ E E(|Myf) 


n 


< 


2 

2 X^ 

n 


A\3 


< —4E - + - 


n 


a\3/2 


re/ 




nd^/'^ 


+ 


re 


-n? 

+ T 


nd^/'^ 


+ 


re 


+ 4^1/2«3/2^ 


re 


< 5reV2a3/2 + 


re 


< 


2dV2 

2re 


+ 


d^ 

2re 


dV2 - ^ 


where the last two inequalities hold for $n \ge (15d)^2$.

Finally, using Lemma E.3 with Eq. (112) and the bound (117) we obtain

$$\big|\mathbb{E}\Phi(\beta, k; M) - \mathbb{E}\Phi(\beta, k; D)\big| \le \frac{2\beta^2 n}{\sqrt{d}},$$

(113) 

(114) 

(115) 

(116) 
(117) 


(118) 


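which is the required claim.

E.2 Proof of Lemma E.2

This proof is by coupling. We first observe that (here the scalar product $\langle \sigma, M\sigma\rangle$ is to be interpreted as a product between matrices with $\sigma \in \mathbb{R}^{n \times k}$)

$$\Phi(\beta, k; B) = \frac{1}{\beta}\log\Big\{\int \exp\{\beta\langle \sigma, B\sigma\rangle\}\,{\rm d}\nu(\sigma)\Big\} \tag{119}$$
$$= \frac{1}{\beta}\log\Big\{\int \exp\{\beta\langle \sigma, D\sigma\rangle + \beta\langle (B - D), \sigma\sigma^{\sf T}\rangle\}\,{\rm d}\nu(\sigma)\Big\} \tag{120}$$
$$\le \frac{1}{\beta}\log\Big\{\int \exp\{\beta\langle \sigma, D\sigma\rangle + n\beta\|B - D\|_{\rm op}\}\,{\rm d}\nu(\sigma)\Big\} \tag{121}$$
$$\le \Phi(\beta, k; D) + n\|B - D\|_{\rm op}, \tag{122}$$

where we used $\|\sigma\sigma^{\sf T}\|_* = \|\sigma\|_F^2 = n$ (with $\|\cdot\|_*$ denoting the nuclear norm). Hence

$$\Big|\frac{1}{n}\Phi(\beta, k; B) - \frac{1}{n}\Phi(\beta, k; D)\Big| \le \|B - D\|_{\rm op}. \tag{123}$$

In order to couple $B$ and $D$ we construct three independent symmetric Gaussian random matrices $Z_0, Z_1, Z_2 \in \mathbb{R}^{n \times n}$ as follows. All three matrices have centered independent entries, and differ in the variances. Setting $v(a) = (a/(nd))(1 - a/n)$ and $v(b) = (b/(nd))(1 - b/n)$, we let

$${\rm Var}(Z_{0,ij}) = \begin{cases} v(b) & \text{if } i \ne j, \\ 0 & \text{if } i = j, \end{cases} \tag{124}$$

$${\rm Var}(Z_{1,ij}) = \begin{cases} v(a) - v(b) & \text{if } \{i,j\} \subseteq S_1 \text{ or } \{i,j\} \subseteq S_2, \text{ and } i \ne j, \\ 0 & \text{otherwise,} \end{cases} \tag{125}$$

and, finally,

$${\rm Var}(Z_{2,ij}) = \begin{cases} (1/n) - v(b) & \text{if } i \ne j, \\ (1/n) & \text{if } i = j. \end{cases} \tag{126}$$

It is therefore easy to see that

$$B = \lambda v v^{\sf T} + Z_0 + Z_2, \tag{127}$$
$$D = \lambda v v^{\sf T} + Z_0 + Z_1. \tag{128}$$

Hence, using Eq. (123) and the triangle inequality,

$$\Big|\frac{1}{n}\mathbb{E}\Phi(\beta, k; B) - \frac{1}{n}\mathbb{E}\Phi(\beta, k; D)\Big| \le \mathbb{E}\|Z_1\|_{\rm op} + \mathbb{E}\|Z_2\|_{\rm op} \tag{129}$$
$$\le 2.1\sqrt{1 - n v(b)} + 2.1\sqrt{n(v(a) - v(b))/2} \tag{130}$$
$$\le 5\sqrt{\frac{a - b}{d}}, \tag{131}$$

where the last bounds hold for all $n \ge n_0$ by standard estimates on the eigenvalues of GOE matrices [AGZ09].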
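The variance bookkeeping behind the coupling, and the size of the two terms in the final bound, can be checked numerically for given $(a, b, n)$. The sketch below uses hypothetical names and assumes $a > b$ and $n$ large compared to $a$; it evaluates one instance rather than proving the inequality.

```python
import math


def v(c, n, d):
    """v(c) = (c / (n d)) (1 - c / n)."""
    return (c / (n * d)) * (1.0 - c / n)


def coupling_bound_terms(a, b, n):
    """Off-diagonal variances of B and D rebuilt from Z0, Z1, Z2, and the two bounds."""
    d = (a + b) / 2.0
    va, vb = v(a, n, d), v(b, n, d)
    off_diag_b = vb + (1.0 / n - vb)   # Var(Z0) + Var(Z2), should equal 1/n
    in_group_d = vb + (va - vb)        # Var(Z0) + Var(Z1), should equal v(a)
    middle = 2.1 * math.sqrt(1.0 - n * vb) + 2.1 * math.sqrt(n * (va - vb) / 2.0)
    final = 5.0 * math.sqrt((a - b) / d)
    return off_diag_b, in_group_d, va, middle, final
```

With $a = 7$, $b = 3$, $n = 1000$, the middle expression evaluates to roughly 2.65 and the final bound to roughly 4.47.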
F Proof of Theorem 5.(a) (deformed GOE matrices, $\lambda \le 1$)

In this section we prove part (a) of Theorem 5. We start with two useful technical facts, and then present the actual proof. Throughout, $B(\lambda) = (\lambda/n)\mathbf{1}\mathbf{1}^{\sf T} + W$, with $W \sim {\rm GOE}(n)$, is defined as per Eq. (18).

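F.1 Two technical lemmas

Lemma F.1. For any fixed $W$, the function $\lambda \mapsto {\rm SDP}(B(\lambda))$ is monotone nondecreasing.

Proof. Let $\lambda_1 \le \lambda_2$ and choose $X_* \in {\rm PSD}_1(n)$ such that $\langle B(\lambda_1), X_*\rangle = {\rm SDP}(B(\lambda_1))$ (this exists since ${\rm PSD}_1(n)$ is compact). Then

$${\rm SDP}(B(\lambda_2)) \ge \langle B(\lambda_2), X_*\rangle \tag{132}$$
$$= \langle B(\lambda_1) + (\lambda_2 - \lambda_1)\mathbf{1}\mathbf{1}^{\sf T}/n,\, X_*\rangle \tag{133}$$
$$\ge {\rm SDP}(B(\lambda_1)), \tag{134}$$

where the last inequality follows since $X_* \succeq 0$. □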

Lemma F.2. Fix $\delta \in (0, 1]$ and $k(n) = \lfloor n\delta\rfloor$. Let $U \in \mathbb{R}^{n \times k(n)}$ be a uniformly random (Haar measure) orthogonal matrix (in particular $U^{\sf T}U = {\rm I}_{k(n)}$). Then there exists $C = C(\delta)$ such that, for any fixed basis vector $e_i$,

P ( max 
V l<i<n 




> 1 - 


n 


1 

20 


(135) 


Proof. In order to lighten the notation, we can assume $n\delta$ to be an integer.

Let $P = UU^{\sf T}$ be the orthogonal projector on the column space of $U$. By the invariance of the Haar measure under rotations, this is a projector onto a uniformly random subspace of dimension $n\delta$ in $\mathbb{R}^n$, and $Y_i = \|U^{\sf T}e_i\|_2^2 = \langle e_i, Pe_i\rangle = \|Pe_i\|_2^2$. Inverting the role of $P$ and $e_i$, we see that $Y_i$ is distributed as the square norm of the first $n\delta$ components of a uniformly random unit vector of $n$ dimensions. Hence

$$Y_i \stackrel{\rm d}{=} \frac{Z_{n\delta}}{Z_{n\delta} + Z_{n(1-\delta)}}, \tag{136}$$

where $Z_\ell$, $\ell \in \{n\delta, n(1-\delta)\}$, denote two independent chi-squared random variables with $\ell$ degrees of freedom. Standard tail bounds on chi-squared random variables imply the claim. □


F.2 Proof of Theorem 5(a)

We first note that

$$\frac{1}{n}{\rm SDP}(B(\lambda)) \le \xi_1(B(\lambda)) \le 2 + o_n(1), \tag{137}$$

where the last inequality holds with high probability, by, e.g., [KY13, Theorem 2.7].

It is therefore sufficient to prove that, for any $\varepsilon > 0$, ${\rm SDP}(B(\lambda))/n \ge 2 - \varepsilon$ with probability converging to one as $n \to \infty$. By Lemma F.1 we only need to prove this for $\lambda = 0$, i.e. to lower bound ${\rm SDP}(W)$ for $W \sim {\rm GOE}(n)$. We will achieve this by constructing a witness, i.e. a feasible point $X \in {\rm PSD}_1(n)$, depending on $W$, such that $\langle W, X\rangle/n \ge 2 - \varepsilon$ with high probability.

A more general construction will be developed in Appendix G to prove part (b) of the Theorem. The case $\lambda = 0$ is however much simpler and we prefer to present it separately here to build intuition.

Fix $\delta > 0$, and let $u_1, u_2, \dots, u_{n\delta}$ be the eigenvectors of $W$ corresponding to the top $n\delta$ eigenvalues. Denote by $U \in \mathbb{R}^{n \times (n\delta)}$ the matrix whose columns are $u_1, u_2, \dots, u_{n\delta}$ and let $D \in \mathbb{R}^{n \times n}$ be the diagonal matrix with entries

$$D_{ii} = (UU^{\sf T})_{ii}. \tag{138}$$

Note that, by invariance of the GOE distribution under orthogonal transformations, $U$ is a uniformly random orthogonal matrix. Hence by Lemma F.2 and union bound


max I Di 

ie[n] 


-<5| < C 


log n 


n 


^1- — 


(139) 


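for $C = C(\delta)$ a suitable constant. We then define our witness as

$$X = D^{-1/2}UU^{\sf T}D^{-1/2}. \tag{140}$$

Clearly $X \in {\rm PSD}_1(n)$ is a feasible point. Further, letting $E = (D/\delta)^{-1/2}$,

$$\langle W, X\rangle = \frac{1}{\delta}\langle W, UU^{\sf T}\rangle - \frac{1}{\delta}\langle W - EWE, UU^{\sf T}\rangle \tag{141}$$
$$\ge \frac{1}{\delta}\sum_{\ell=1}^{n\delta}\xi_\ell(W) - \frac{1}{\delta}\|W - EWE\|_2\,\|UU^{\sf T}\|_* \tag{142}$$
$$\ge n\,\xi_{n\delta}(W) - \frac{1}{\delta}\|W\|_2\big(2 + \|E - {\rm I}\|_2\big)\|E - {\rm I}\|_2\,\|UU^{\sf T}\|_*. \tag{143}$$

Here $\|Z\|_*$ denotes the nuclear norm of $Z$ (sum of the absolute values of eigenvalues) and in the last inequality we used $\|W - EWE\|_2 \le \|W - EW\|_2 + \|EW - EWE\|_2 \le \|W\|_2\|E - {\rm I}\|_2 + \|W\|_2\|E\|_2\|E - {\rm I}\|_2$.

Next, since $UU^{\sf T}$ is a projector on $n\delta$ dimensions, we have $\|UU^{\sf T}\|_* = n\delta$, whence

$$\frac{1}{n}\langle W, X\rangle \ge \xi_{n\delta}(W) - \|W\|_2\big(2 + \|E - {\rm I}\|_2\big)\|E - {\rm I}\|_2. \tag{144}$$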

n 

By Eq. (139), we have $\|E - {\rm I}\|_2 \to 0$ almost surely, and by a classical result [AGZ09], also the following limits hold almost surely

$$\lim_{n\to\infty}\|W\|_2 = 2, \tag{145}$$
$$\lim_{n\to\infty}\xi_{n\delta}(W) = \xi_*(\delta), \tag{146}$$

where $\xi_*(\delta) \uparrow 2$ as $\delta \to 0$. Indeed $\xi_*(\delta)$ can be expressed explicitly in terms of the Wigner semicircle law, namely, for $\delta \in (0, 1)$ it is the unique positive solution of the following equation:

$$\int_{\xi_*(\delta)}^{2}\frac{\sqrt{4 - x^2}}{2\pi}\,{\rm d}x = \delta. \tag{147}$$

Substituting in Eq. (144), we get, almost surely (and as a consequence in probability)

$$\liminf_{n\to\infty}\frac{1}{n}\langle W, X\rangle \ge \xi_*(\delta) \ge 2 - \varepsilon, \tag{148}$$

where the last inequality holds by taking $\delta$ small enough.
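The quantity $\xi_*(\delta)$ is easy to compute numerically, which makes the statement "$\xi_*(\delta) \uparrow 2$ as $\delta \to 0$" concrete. The sketch below uses the closed form of the semicircle tail integral, obtained from the antiderivative $\tfrac{x}{2}\sqrt{4 - x^2} + 2\arcsin(x/2)$ of $\sqrt{4 - x^2}$, and solves Eq. (147) by bisection; function names and the tolerance are illustrative choices.

```python
import math


def semicircle_tail(x):
    """Mass that the semicircle law puts on [x, 2], for x in [-2, 2]."""
    return 0.5 - x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) - math.asin(x / 2.0) / math.pi


def xi_star(delta, tol=1e-12):
    """Solve semicircle_tail(x) = delta by bisection (the tail is decreasing in x)."""
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_tail(mid) > delta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, taking $\delta = 10^{-4}$ already gives a value above 1.98, so the loss with respect to the edge of the spectrum is small.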

G Proof of Theorem 5.(b) (deformed GOE matrices, $\lambda > 1$)

We begin by recalling the definition of the deformed GOE matrix $B = B(\lambda)$, given in Eq. (18):

$$B = \frac{\lambda}{n}\mathbf{1}\mathbf{1}^{\sf T} + W, \tag{149}$$

where $W \sim {\rm GOE}(n)$, and we denote by $(u_1, \xi_1), \dots, (u_n, \xi_n)$ the eigenpairs of $B$, namely

$$B u_k = \xi_k u_k, \tag{150}$$

where $\xi_1 \ge \xi_2 \ge \cdots \ge \xi_n$.

The proof of Theorem 5(b) is based on the following construction of a witness $X$, which depends on (small) parameters $\varepsilon, \delta > 0$ to be fixed at the end. In order not to complicate the notation unnecessarily, we will assume $n\delta$ to be an integer. Let $R : \mathbb{R} \to \mathbb{R}$ be a 'capping' function, i.e.

$$R(x) = \begin{cases} 1 & \text{if } x \ge 1, \\ x & \text{if } -1 < x < 1, \\ -1 & \text{if } x \le -1. \end{cases} \tag{151}$$

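We then define $\varphi \in \mathbb{R}^n$ by letting $\varphi_i = R(\varepsilon\sqrt{n}\,u_{1,i})$. We also define $U \in \mathbb{R}^{n \times (n\delta)}$ as the matrix whose $i$-th column is $u_{i+1}$ (hence it contains the eigenvectors $u_2, \dots, u_{n\delta+1}$). Note that $U$ is an orthogonal matrix: $U^{\sf T}U = {\rm I}_{n\delta}$. Finally, we define $D \in \mathbb{R}^{n \times n}$ to be a diagonal matrix with entries

$$D_{ii} = \frac{\sqrt{1 - \varphi_i^2}}{\|U^{\sf T}e_i\|_2}. \tag{152}$$

Our witness construction is defined as

$$X = \varphi\varphi^{\sf T} + DUU^{\sf T}D. \tag{153}$$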
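Feasibility of the witness, $X_{ii} = \varphi_i^2 + D_{ii}^2\|U^{\sf T}e_i\|_2^2 = 1$, can be checked on a toy instance. In the sketch below the vector `u1` and the rows of `U` are hand-picked orthonormal data (not eigenvectors of a random matrix), every row of `U` is assumed nonzero, and the function names are hypothetical.

```python
import math


def capped(x):
    """The capping function R of Eq. (151)."""
    return max(-1.0, min(1.0, x))


def witness(u1, U, eps):
    """X = phi phi^T + D U U^T D, with U given as a list of rows (each assumed nonzero)."""
    n = len(u1)
    phi = [capped(eps * math.sqrt(n) * x) for x in u1]
    D = [math.sqrt(1.0 - phi[i] ** 2) / math.sqrt(sum(x * x for x in U[i]))
         for i in range(n)]
    return [[phi[i] * phi[j] + D[i] * D[j] * sum(p * q for p, q in zip(U[i], U[j]))
             for j in range(n)] for i in range(n)]
```

Since $X$ is a sum of two Gram matrices, it is positive semidefinite by construction; the role of $D$ is only to restore the unit diagonal after the rank-one term $\varphi\varphi^{\sf T}$ has been added.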


We analyze this construction through a sequence of lemmas. One of the proofs will use Lemma G.5, to which we devote a separate section. Throughout we assume the above definitions and the setting of Theorem 5. We use $C, C_0, \dots$ to denote finite non-random universal constants. Without loss of generality, we will also assume $\lambda \in (1, C_0)$ for some $C_0 > 1$.

We start from an elementary fact.

Lemma G.1. There exists a constant $C$ such that

$$\lim_{n\to\infty}\mathbb{P}\big(\|B\|_2 \ge C\big) = 0. \tag{154}$$

Proof. It follows from the triangle inequality that $\|B\|_2 \le \lambda + \|W\|_2$. Hence the claim follows by standard bounds on the eigenvalues of GOE matrices [AGZ09, Theorem 2.1.22]. □


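Lemma G.2. There exists a constant $C > 0$ such that, with high probability,

$$\Big|\frac{1}{n}\langle B, \varphi\varphi^{\sf T}\rangle - \varepsilon^2\xi_1\Big| \le C\varepsilon^4. \tag{155}$$

Proof. Define $\bar{R}(x) = x - R(x)$. Further, for a vector $x = (x_1, x_2, \dots, x_n)$, we write $R(x)$ for the vector obtained applying $R$ componentwise, i.e. $R(x) = (R(x_1), R(x_2), \dots, R(x_n))$, and similarly for $\bar{R}(x)$. We then have

$$\Big|\frac{1}{n}\langle B, \varphi\varphi^{\sf T}\rangle - \varepsilon^2\xi_1\Big| = \Big|\frac{1}{n}\langle B, \varphi\varphi^{\sf T}\rangle - \frac{1}{n}\langle B, (\varepsilon\sqrt{n}u_1)(\varepsilon\sqrt{n}u_1)^{\sf T}\rangle\Big| \tag{156}$$
$$\le \frac{2}{n}\big|\langle (\varepsilon\sqrt{n}u_1), B\,\bar{R}(\varepsilon\sqrt{n}u_1)\rangle\big| + \frac{1}{n}\big|\langle \bar{R}(\varepsilon\sqrt{n}u_1), B\,\bar{R}(\varepsilon\sqrt{n}u_1)\rangle\big| \tag{157}$$
$$\le 4\|B\|_2\,\frac{1}{\sqrt{n}}\|\bar{R}(\varepsilon\sqrt{n}u_1)\|_2\,\max\Big(\varepsilon;\ \frac{1}{\sqrt{n}}\|\bar{R}(\varepsilon\sqrt{n}u_1)\|_2\Big). \tag{158}$$

Note that

$$\bar{R}(x)^2 = \begin{cases} (|x| - 1)^2 & \text{if } |x| \ge 1, \\ 0 & \text{if } |x| < 1. \end{cases} \tag{159}$$

In particular $\bar{R}(x)^2 \le x^6$ for all $x$. We therefore have

$$\frac{1}{n}\|\bar{R}(\varepsilon\sqrt{n}u_1)\|_2^2 = \frac{1}{n}\sum_{i=1}^n \bar{R}(\varepsilon\sqrt{n}u_{1,i})^2 \tag{160}$$
$$\le \frac{\varepsilon^6}{n}\sum_{i=1}^n(\sqrt{n}u_{1,i})^6. \tag{161}$$

Next we decompose $u_1 = z_1(\mathbf{1}/\sqrt{n}) + \sqrt{1 - z_1^2}\,u_1^\perp$, where $z_1 = \langle u_1, \mathbf{1}\rangle/\sqrt{n} \in [0, 1]$, and $\langle u_1^\perp, \mathbf{1}\rangle = 0$. Since $(a + b)^6 \le 2^5(a^6 + b^6)$, we have

$$\frac{1}{n}\|\bar{R}(\varepsilon\sqrt{n}u_1)\|_2^2 \le \frac{\varepsilon^6}{n}\sum_{i=1}^n 32\big(1 + (\sqrt{n}u^\perp_{1,i})^6\big) \tag{162}$$
$$\le 32\varepsilon^6\Big(1 + \frac{1}{n}\sum_{i=1}^n(\sqrt{n}u^\perp_{1,i})^6\Big) \le C\varepsilon^6, \tag{163}$$

where the last inequality holds with high probability for some absolute constant $C$ and all $n \ge n_0$, by Lemma G.5 below, applied with $a = 6$, $b = 0$. Using this together with Eq. (154) in Eq. (158) we get

$$\Big|\frac{1}{n}\langle B, \varphi\varphi^{\sf T}\rangle - \varepsilon^2\xi_1\Big| \le C\varepsilon^3\max(\varepsilon;\ C\varepsilon^3) \le C'\varepsilon^4, \tag{164}$$

which completes our proof. □

Lemma G.3. Let $F \in \mathbb{R}^{n \times n}$ be a diagonal matrix with entries $F_{ii} = \sqrt{1 - \varphi_i^2}$. Then, there exists a constant $K = K(\delta)$ such that, with high probability,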


-{B,DUU^D) 

n 


— iB,FUU'^F) 

nd 


< K{6) 



(165) 


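Proof. Define $H$ to be a diagonal matrix with entries $H_{ii} = \sqrt{\delta}/\|U^{\sf T} e_i\|_2$. Then by definition $D = FH/\sqrt{\delta}$ and

\begin{align}
\left| \frac{1}{n}\langle B, D U U^{\sf T} D\rangle - \frac{1}{n\delta}\langle B, F U U^{\sf T} F\rangle \right| &= \frac{1}{n\delta}\,\big| \langle FBF, H U U^{\sf T} H\rangle - \langle FBF, U U^{\sf T}\rangle \big| \tag{166}\\
&\le \frac{1}{n\delta}\,\|H \tilde B H - \tilde B\|_2\, \|U U^{\sf T}\|_* , \tag{167}
\end{align}

where $\tilde B = FBF$, and we recall that $\|M\|_*$ denotes the nuclear norm of matrix $M$. Note that $\|F\|_2 = \max_{i\in[n]} |F_{ii}| \le 1$, hence by Eq. (154) we have $\|\tilde B\|_2 \le C$ with high probability. Further, since $U U^{\sf T}$ is a projector on a space of $n\delta$ dimensions, we have $\|U U^{\sf T}\|_* = n\delta$. Therefore

\begin{align}
\left| \frac{1}{n}\langle B, D U U^{\sf T} D\rangle - \frac{1}{n\delta}\langle B, F U U^{\sf T} F\rangle \right| &\le \|H \tilde B H - \tilde B\|_2 \tag{168}\\
&\le \|\tilde B\|_2\, \|H - I\|_2\, \big(2 + \|H - I\|_2\big) \tag{169}\\
&\le C'\, \|H - I\|_2 \max\big(1\, ;\, \|H - I\|_2\big) , \tag{170}
\end{align}

where we used $\|\tilde B\|_2 \le \|B\|_2 \|F\|_2^2 \le \|B\|_2 \le C$ by Lemma G.1. Note that

$$\|H - I\|_2 = \max_{1\le i\le n} \left| \frac{\sqrt{\delta}}{\|U^{\sf T} e_i\|_2} - 1 \right| . \tag{171}$$

The proof is completed by Lemma F.2 and union bound. □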
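The operator-norm step between (168) and (169) rests on the identity $H\tilde B H - \tilde B = (H-I)\tilde B(H-I) + (H-I)\tilde B + \tilde B(H-I)$. A minimal numerical sketch of the resulting bound follows; the helper name `perturbation_bound` and the random test matrices are illustrative assumptions, not objects from the paper.

```python
import numpy as np

def perturbation_bound(B, h):
    """Return (lhs, rhs) for ||H B H - B||_2 <= ||B||_2 ||H - I||_2 (2 + ||H - I||_2),
    where H = diag(h) and ||.||_2 is the operator (spectral) norm."""
    H = np.diag(h)
    I = np.eye(len(h))
    lhs = np.linalg.norm(H @ B @ H - B, 2)
    e = np.linalg.norm(H - I, 2)
    rhs = np.linalg.norm(B, 2) * e * (2.0 + e)
    return lhs, rhs

rng = np.random.default_rng(0)
G = rng.standard_normal((30, 30))
B_sym = (G + G.T) / 2.0                        # arbitrary symmetric matrix
h = 1.0 + 0.1 * rng.standard_normal(30)        # diagonal of H, close to the identity
lhs, rhs = perturbation_bound(B_sym, h)
```

In the proof, $\|H - I\|_2$ is small with high probability, so the right-hand side is of order $\|H - I\|_2$.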


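Lemma G.4. There exists a finite constant $C > 0$ such that, for all $\delta, \varepsilon > 0$, we have

$$\lim_{n\to\infty} \mathbb{P}\Big( \langle u_i, FBF u_i\rangle \ge L(\varepsilon,\delta) \;\; \forall i \in \{2, \dots, n\delta + 1\} \Big) = 1 , \tag{172}$$

$$L(\varepsilon,\delta) \equiv 2 - 2\varepsilon^2 - C\delta^{2/3} - C\varepsilon^4 . \tag{173}$$

The proof of this lemma is longer than the others, and is deferred to Section G.2.
We are now in position to prove Theorem 5(b).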


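Proof of Theorem 5(b). We use the explicit construction in Eq. (153). Note that $X \in {\rm PSD}_1(n)$. Indeed $X \succeq 0$ as it is the sum of two positive-semidefinite matrices. Further, $X_{ii} = 1$, since

\begin{align}
\langle e_i, X e_i\rangle &= |\langle e_i, \psi\rangle|^2 + \|U^{\sf T} D e_i\|_2^2 \tag{174}\\
&= \psi_i^2 + D_{ii}^2 \|U^{\sf T} e_i\|_2^2 = 1 . \tag{175}
\end{align}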


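We are left with the task of lower bounding the objective value. With high probability

\begin{align}
\frac{1}{n}\langle B, X\rangle &= \frac{1}{n}\langle B, \psi\psi^{\sf T}\rangle + \frac{1}{n}\langle B, D U U^{\sf T} D\rangle \tag{176}\\
&\ge \varepsilon^2 \xi_1 - C\varepsilon^4 + \frac{1}{n\delta}\langle B, F U U^{\sf T} F\rangle - K(\delta)\sqrt{\frac{\log n}{n}} , \tag{177}
\end{align}

where we used Lemma G.2 and Lemma G.3, together with $\langle B, u_1 u_1^{\sf T}\rangle = \xi_1$. For all $n$ large enough, we can bound the term $K(\delta)\sqrt{(\log n)/n}$ by $C\varepsilon^4$. Further, by [KY13, Theorem 2.7], $\xi_1 \ge (\lambda + \lambda^{-1}) - n^{-0.4}$ with high probability. Since $\lambda + \lambda^{-1} > 2$, there exists $\Delta_0(\lambda) > 0$ such that, with high probability

$$\frac{1}{n}\langle B, X\rangle \ge (2 + \Delta_0(\lambda))\,\varepsilon^2 - C\varepsilon^4 + \frac{1}{n\delta}\sum_{i=2}^{n\delta+1} \langle u_i, FBF u_i\rangle . \tag{178}$$

Now we apply Lemma G.4 to get, with high probability

\begin{align}
\frac{1}{n}\langle B, X\rangle &\ge (2 + \Delta_0(\lambda))\,\varepsilon^2 - C\varepsilon^4 + 2 - 2\varepsilon^2 - C\delta^{2/3} - C\varepsilon^4 \tag{179}\\
&\ge 2 + \Delta_0(\lambda)\,\varepsilon^2 - 2C\varepsilon^4 - C\delta^{2/3} . \tag{180}
\end{align}

Setting $\varepsilon = \sqrt{\Delta_0(\lambda)/(4C)}$ and $\delta = [\Delta_0(\lambda)^2/(16C^2)]^{3/2}$, we conclude that

$$\lim_{n\to\infty} \mathbb{P}\Big( \frac{1}{n}\langle B, X\rangle \ge 2 + \frac{\Delta_0(\lambda)^2}{16C} \Big) = 1 , \tag{181}$$

which completes the proof of the theorem. □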
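For the reader's convenience, the arithmetic behind the last step can be spelled out. The computation below is a sketch that assumes the parameter choices read $\varepsilon^2 = \Delta_0/(4C)$ and $\delta^{2/3} = \Delta_0^2/(16C^2)$, writing $\Delta_0$ for $\Delta_0(\lambda)$, and substitutes them into the right-hand side of Eq. (180).

```latex
\begin{align*}
\Delta_0\,\varepsilon^2 - 2C\varepsilon^4 - C\delta^{2/3}
  &= \frac{\Delta_0^2}{4C} - 2C\cdot\frac{\Delta_0^2}{16C^2} - C\cdot\frac{\Delta_0^2}{16C^2} \\
  &= \frac{\Delta_0^2}{4C} - \frac{\Delta_0^2}{8C} - \frac{\Delta_0^2}{16C}
   = \frac{\Delta_0^2}{16C} \;>\; 0 .
\end{align*}
```

Under these assumptions the gain $\Delta_0\varepsilon^2$ from the top eigenvector dominates both error terms, which gives a strictly positive excess over the value 2.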


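G.1 A law of large numbers for the eigenvectors of deformed Wigner matrices

In this section we establish a lemma that will be used repeatedly in the proof of Lemma G.4.

Lemma G.5. Fix $i \in \{2, \dots, n\}$ and let $u_1^{\perp}$, $u_i^{\perp}$ be the projections of eigenvectors $u_1$, $u_i$ of $B$ orthogonal to $\mathbf{1}$ (explicitly, $u^{\perp} = u - \langle \mathbf{1}, u\rangle \mathbf{1}/n$ for $u \in \{u_1, u_i\}$). For any $a, b \in \mathbb{N}$, and $t, C \in \mathbb{R}_{>0}$ there exists $n_0 = n_0(a, b, t, C) < \infty$ such that, for all $n \ge n_0$

$$\mathbb{P}\left( \left| \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k})^a (\sqrt{n}\,u^{\perp}_{i,k})^b - m_a m_b \right| \ge t \right) \le \frac{1}{n^C} , \tag{182}$$

where $m_a \equiv \mathbb{E}\{Z^a\}$, for $Z \sim {\sf N}(0,1)$.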

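Proof. Throughout the proof, we let $v = \mathbf{1}/\sqrt{n}$. Note that the law of the random matrix $B$ is invariant under transformations that leave $v$ unchanged. Namely, if $R \in \mathbb{R}^{n\times n}$ is an orthogonal matrix such that $Rv = v$ or $Rv = -v$, then

$$R B R^{\sf T} \stackrel{\rm d}{=} B . \tag{183}$$

It follows that the joint law of $u_1^{\perp}$, $u_i^{\perp}$ is left invariant by such a transformation. Formally $(R u_1^{\perp}, R u_i^{\perp}) \stackrel{\rm d}{=} (u_1^{\perp}, u_i^{\perp})$. Hence, the pair $(u_1^{\perp}, u_i^{\perp})$ is a uniformly random orthonormal pair, in the subspace orthogonal to $v$ (invariance under rotations characterizes this distribution uniquely). Hereafter, we'll set $i = 2$ without loss of generality.

We can construct the pair $(u_1^{\perp}, u_2^{\perp})$ by generating i.i.d. vectors $g_1, g_2 \sim {\sf N}(0, {\rm I}_n)$, and then applying the Gram-Schmidt procedure to the triple $(v, g_1, g_2)$. Explicitly

\begin{align}
u_1^{\perp} &= \frac{g_1 - \langle g_1, v\rangle v}{\|g_1 - \langle g_1, v\rangle v\|_2} , \tag{184}\\
u_2^{\perp} &= \frac{g_2 - \langle g_2, v\rangle v - \langle g_2, u_1^{\perp}\rangle u_1^{\perp}}{\|g_2 - \langle g_2, v\rangle v - \langle g_2, u_1^{\perp}\rangle u_1^{\perp}\|_2} . \tag{185}
\end{align}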
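We then have

\begin{align}
\frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k})^a (\sqrt{n}\,u^{\perp}_{2,k})^b &= \frac{U_{a,b}}{U_{2,0}^{a/2}\, U_{0,2}^{b/2}} , \tag{186}\\
U_{a,b} &\equiv \frac{1}{n}\sum_{k=1}^n \big(g_{1,k} - \langle g_1, v\rangle v_k\big)^a \big(g_{2,k} - \langle g_2, v\rangle v_k - \langle g_2, u_1^{\perp}\rangle u^{\perp}_{1,k}\big)^b . \tag{187}
\end{align}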

We claim that, with the same notations as in the statement of the lemma,

$$\mathbb{P}\big\{ |U_{a,b} - m_a m_b| \ge t \big\} \le \frac{1}{n^C} , \tag{188}$$

for $n \ge n_0(a, b, t, C)$. Once this claim is proved, the lemma follows from the representation (186), using a union bound over the three random variables $U_{a,b}$, $U_{2,0}$, $U_{0,2}$, since $m_2 = 1$ (and eventually increasing $n_0$).

In order to prove the claim (188), we expand the powers in Eq. (187), to get:


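\begin{align}
U_{a,b} &= U_{a,b}(0) + \sum_{0\le l_1\le a}\ \sum_{0\le l_2, l_3\le b} K_{a,b}(l_1, l_2, l_3)\, U_{a,b}(l_1, l_2, l_3)\, \mathbb{1}_{l_1+l_2+l_3>0}\, \mathbb{1}_{l_2+l_3\le b} , \tag{189}\\
U_{a,b}(0) &\equiv \frac{1}{n}\sum_{k=1}^n g_{1,k}^a\, g_{2,k}^b , \tag{190}\\
U_{a,b}(l_1, l_2, l_3) &\equiv \frac{1}{n^{(l_1+l_2+l_3)/2}}\, \langle g_1, v\rangle^{l_1} \langle g_2, v\rangle^{l_2} \langle g_2, u_1^{\perp}\rangle^{l_3} \Big( \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k})^{l_3}\, g_{1,k}^{a-l_1}\, g_{2,k}^{b-l_2-l_3} \Big) , \tag{191}
\end{align}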

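where $K_{a,b}(l_1, l_2, l_3)$ are combinatorial factors (bounded as $|K_{a,b}(l_1, l_2, l_3)| \le 2^a 3^b$). Consider first the term $U_{a,b}(0)$. By definition $\mathbb{E}\{U_{a,b}(0)\} = m_a m_b$. Further, by Markov inequality, writing $Y_i \equiv g_{1,i}^a g_{2,i}^b - m_a m_b$ and taking $\ell$ a sufficiently large integer,

\begin{align}
\mathbb{P}\big\{ |U_{a,b}(0) - m_a m_b| \ge t \big\} &\le \frac{1}{t^{2\ell}}\, \mathbb{E}\Big\{ \Big[ \frac{1}{n}\sum_{i=1}^n Y_i \Big]^{2\ell} \Big\} \tag{192}\\
&\le \frac{1}{t^{2\ell}}\, n^{-\ell}\, C_0(a, b, \ell) \le \frac{1}{n^C} , \tag{193}
\end{align}

where $C_0$ is a combinatorial factor, and the last inequality holds for any $C$, provided $n \ge n_0(a, b, t, C)$.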

Consider next any of the terms $U_{a,b}(l_1, l_2, l_3)$. Note that $\langle g_1, v\rangle$, $\langle g_2, v\rangle$, $\langle g_2, u_1^{\perp}\rangle \sim {\sf N}(0,1)$ (but
not independent). By Gaussian tail bounds, P(|(gfi,u)| > aVlogn) < for all n large enough. 

By a union bound 


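$$\mathbb{P}\big\{ |\langle g_1, v\rangle|^{l_1}\, |\langle g_2, v\rangle|^{l_2}\, |\langle g_2, u_1^{\perp}\rangle|^{l_3} \ge (\log n)^{a+b} \big\} \le \frac{1}{n^C} , \tag{194}$$

for all $C > 0$, provided $n \ge n_0(C)$. Proceeding analogously, and using the construction (184), we
get for all $n \ge n_0(C)$,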



( 195 ) 
Finally, using these probability bounds in Eq. (191), we get, with probability at least $1 - 2n^{-C}$,

1/2 


1 


k=l 


- ^(q+Z2+Z3)/2 nT^^^U2(a-h),2ib-h-h) (0) 


a/2 


(196) 

(197) 


Hence, using Eq. (189) and the bound (193) applied to $U_{2(a-l_1),\,2(b-l_2-l_3)}(0)$, we obtain (since $l_1 + l_2 + l_3 \ge 1$)


^{\Ua,b-Ua,b{0)\ > 


(log n)"’*'^ ^ 1 


n 


1/2 


- 


(198) 


for all $C > 0$ and all $n \ge n_0(a, b, t, C)$. Applying again Eq. (193) to $U_{a,b}(0)$, we obtain the desired
bound, Eq. (188), which finishes the proof. □
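The Gram-Schmidt representation (184)-(185) and the law of large numbers of Lemma G.5 can be illustrated numerically. The following is a minimal sketch, assuming only NumPy; the function names are hypothetical and the sample size is chosen for illustration, not taken from the paper.

```python
import numpy as np

def random_orthonormal_pair(n, rng):
    """Gram-Schmidt applied to the triple (v, g1, g2), following Eqs. (184)-(185)."""
    v = np.ones(n) / np.sqrt(n)
    g1 = rng.standard_normal(n)
    g2 = rng.standard_normal(n)
    w1 = g1 - (g1 @ v) * v
    u1 = w1 / np.linalg.norm(w1)
    w2 = g2 - (g2 @ v) * v - (g2 @ u1) * u1
    u2 = w2 / np.linalg.norm(w2)
    return v, u1, u2

def empirical_moment(u1, u2, a, b):
    """(1/n) * sum_k (sqrt(n) u1_k)^a (sqrt(n) u2_k)^b, the quantity in Eq. (182)."""
    s = np.sqrt(u1.shape[0])
    return float(np.mean((s * u1) ** a * (s * u2) ** b))

def gaussian_moment(a):
    """m_a = E[Z^a] for Z ~ N(0,1): zero for odd a, (a-1)!! for even a."""
    if a % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(a - 1, 0, -2))) if a > 0 else 1.0

rng = np.random.default_rng(0)
v, u1, u2 = random_orthonormal_pair(20000, rng)
gap = abs(empirical_moment(u1, u2, 2, 2) - gaussian_moment(2) * gaussian_moment(2))
```

For moderate $n$ the value of `gap` is already small, in line with the statement that the empirical mixed moments concentrate around $m_a m_b$.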


G.2 Proof of Lemma G.4

We begin with a technical lemma.


Lemma G.6. Fix $i \in \{2, \dots, n\}$ and let $u_i$ be the $i$-th eigenvector of the deformed GOE matrix $B$. Let $v = \mathbf{1}/\sqrt{n}$.

Then, for any $\eta > 0$ there exists $n_0 = n_0(\eta)$ (independent of $i$) such that, for all $n \ge n_0(\eta)$

$$\mathbb{P}\big( |\langle v, u_i\rangle| \ge \eta \big) \le \frac{1}{n^{10}} . \tag{199}$$

Proof. Consider the eigenvalue equation $B u_i = \xi_i u_i$ or, equivalently,

$$\lambda \langle v, u_i\rangle v + W u_i = \xi_i u_i . \tag{200}$$

Solving for $u_i$ and then using $\|u_i\|_2^2 = 1$, we get the equation

$$1 = \lambda^2 \langle u_i, v\rangle^2\, \langle v, (\xi_i {\rm I} - W)^{-2} v\rangle . \tag{201}$$

Since, by assumption $\lambda > 1$, it is sufficient to prove that, for any $M > 0$, $\langle v, (\xi_i {\rm I} - W)^{-2} v\rangle \ge M$ with probability at least $1 - n^{-10}$ provided $n \ge n_0(M)$.

In order to prove this fact, let $(\xi_{0,1}, u_{0,1}), \dots, (\xi_{0,n}, u_{0,n})$ be the eigenpairs of $W$, and notice that, by the interlacing inequality, $\xi_{0,i} \le \xi_i \le \xi_{0,i-1}$. Further assume $i \in \{2, \dots, n/2\}$ (the proof proceeds analogously in the other case). Then, fixing $\sigma > 0$ a small number, we have


^ [fi - ?0,fc) 

(202) 

i-\-ncr 1 / \ 12 

~ “ ^ 0 ,i+na)‘^ 

(203) 

> (C e 

— Z 0 ,i+na) 

(204) 
where, for notational simplicity, we assumed $n\sigma$ to be an integer, and $U_0 \in \mathbb{R}^{n\times n\sigma}$ is a matrix whose columns are the eigenvectors $u_{0,i+1}, \dots, u_{0,i+n\sigma}$.

Note that, by invariance of $W \sim {\rm GOE}(n)$ under rotations, $U_0$ is a uniformly random orthogonal matrix with the assigned dimensions. Lemma F.2 then implies, for all $n \ge n_1(\sigma)$,


U^v\\l>^-r \ > 1 - 


a 


n 


20 ■ 


(205) 


For /c G {1,..., n}, let be the unique solution in (—2, 2) of 

r-2 


/ 

hk 


-\/4 — 


dx = 


k 


n 


(206) 


Then, concentration of the eigenvalues of Wigner matrices [AGZ09, Theorem 2.3.5], together with
the convergence to the semicircle law, implies, for all $n \ge n_2(\sigma)$, and letting $j = i + n\sigma$,




(207) 


Further, by definition. 


fii y/4 — 

a = -dx 

k 27r 


> 


j: 






27r 


dx 




(208) 

(209) 

( 210 ) 


with $C_0$ a numerical constant. Using this bound together with the concentration bound (207) we
get, for all $\sigma$ small enough, and all $n \ge n_2(\sigma)$


^\^i-C^+na\<Cla^/^) >1-^. ( 211 ) 

Using this inequality together with Eq. (205) in Eq. (204), we get

p((^, (CJ - w)-\) > C2u-i/3^ ^ 1 ’ ( 212 ) 

which implies the claim of the Lemma, by taking $\sigma$ a small enough constant. □

Define $P^{\perp}_{1,i}$ to be the projector orthogonal to the space spanned by $\{u_1, u_i\}$. The following Lemma bounds the contribution of this space.

Lemma G.7. Recall that $F \in \mathbb{R}^{n\times n}$ denotes the diagonal matrix with entries $F_{ii} = \sqrt{1 - \psi_i^2}$. Then, there exist constants $C > 0$, and $n_0 = n_0(\varepsilon)$ such that, for all $i \in \{2, \dots, n\delta + 1\}$, and all $n \ge n_0(\varepsilon)$, we have

¥{\\P,^,Fui\\2>Ce^) <^. (213) 
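Proof of Lemma G.7. We decompose $u_i$ as

$$u_i = z_i \frac{\mathbf{1}}{\sqrt{n}} + \sqrt{1 - z_i^2}\, u_i^{\perp} , \tag{214}$$

where $z_i = |\langle u_i, \mathbf{1}/\sqrt{n}\rangle| \in [0,1]$ and $\langle u_i^{\perp}, \mathbf{1}\rangle = 0$ (note that we can assume $\langle u_i, \mathbf{1}\rangle \ge 0$ by eventually flipping $u_i$). Since $\|F - {\rm I}\|_2 = \max_{j\le n} |F_{jj} - 1| \le 1$, and $P^{\perp}_{1,i} u_i = 0$, we have

\begin{align}
\|P^{\perp}_{1,i} F u_i\|_2 &= \|P^{\perp}_{1,i} (F - {\rm I}) u_i\|_2 \tag{215}\\
&\le z_i\, \|P^{\perp}_{1,i} (F - {\rm I}) \mathbf{1}/\sqrt{n}\|_2 + \sqrt{1 - z_i^2}\, \|P^{\perp}_{1,i} (F - {\rm I}) u_i^{\perp}\|_2 \tag{216}\\
&\le z_i + \|(F - {\rm I}) u_i^{\perp}\|_2 . \tag{217}
\end{align}

From Lemma G.6, there exists a constant $n_1 = n_1(\varepsilon)$ such that, for all $n \ge n_1(\varepsilon)$

$$\mathbb{P}\big( z_i \ge \varepsilon^2 \big) \le \frac{1}{n^{10}} . \tag{218}$$

For the second contribution in Eq. (217) we use

\begin{align}
\|(F - {\rm I}) u_i^{\perp}\|_2^2 &= \sum_{k=1}^n \Big( 1 - \sqrt{1 - \psi_k^2} \Big)^2 (u^{\perp}_{i,k})^2 \tag{219}\\
&\stackrel{(a)}{\le} \sum_{k=1}^n \psi_k^4\, (u^{\perp}_{i,k})^2 \tag{220}\\
&\stackrel{(b)}{\le} \varepsilon^4\, \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^4 (\sqrt{n}\,u^{\perp}_{i,k})^2 \tag{221}\\
&\stackrel{(c)}{\le} \varepsilon^4 \Big( \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^8 \Big)^{1/2} \Big( \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{i,k})^4 \Big)^{1/2} , \tag{222}
\end{align}

where inequality (a) follows from $1 - \sqrt{1 - t} \le t$ for $t \in [0,1]$, inequality (b) from $R(x)^2 \le x^2$, and (c) from Cauchy-Schwarz.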

We next bound with high probability each term on the right hand side in Eq. (222). In the
following, we let $v = \mathbf{1}/\sqrt{n}$. Let us start with the second term. By applying Lemma G.5 with
$a = 0$, $b = 4$, we find that, for all $n \ge n_0$ (with $n_0$ an absolute constant)


n 




k=l 


(223) 


Consider next the first term on the right-hand side of Eq. (222). We have $u_1 = z_1 v + \sqrt{1 - z_1^2}\, u_1^{\perp}$, where $z_1 = |\langle u_1, v\rangle| \in [0,1]$, and, again, $u_1^{\perp}$ is orthogonal to $v$. By the triangle inequality, we have $\|u_1\|_8 \le z_1 \|v\|_8 + \sqrt{1 - z_1^2}\, \|u_1^{\perp}\|_8 \le n^{-3/8} + \|u_1^{\perp}\|_8$, and therefore

$$\frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^8 \le 128 + \frac{128}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k})^8 . \tag{224}$$
Using this bound together with Lemma G.5 (with $a = 8$, $b = 0$) we find that, for all $n \ge n_0$ (with
$n_0$ an absolute constant)


1 1 
> lOOo) < ^ 


(225) 


k=l 


Using Eqs. (223) and (225) in Eq. (222), we get, for all $n$ large enough and some constant $C$,


p(ll(F-iKll2>feq <i, 


(226) 


Using this in Eq. (217), together with Eq. (218), we obtain the desired claim. □

The next lemma controls the effect of $F$ along $u_i$.

Lemma G.8. There exist constants $C > 0$, and $n_0 = n_0(\varepsilon)$ such that, for all $i \in \{2, \dots, n\}$, and
all $n \ge n_0(\varepsilon)$, we have


P((ni, Fui) > - Ce^) > 1 - 


C 




(227) 


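Proof of Lemma G.8. Throughout the proof, we let $v = \mathbf{1}/\sqrt{n}$. We decompose $u_i = z_i v + \sqrt{1 - z_i^2}\, u_i^{\perp}$ where $z_i = |\langle v, u_i\rangle| \in [0,1]$ and $\langle v, u_i^{\perp}\rangle = 0$ (note that we can always assume $\langle u_i, v\rangle \ge 0$ by eventually flipping $u_i$). Since $F$ is diagonal with $F_{jj} = \sqrt{1 - \psi_j^2}$, we have $\|F\|_2 = \max_{j\le n} |F_{jj}| \le 1$, and $F \succeq 0$. Therefore

\begin{align}
\langle u_i, F u_i\rangle &= z_i^2 \langle v, F v\rangle + 2 z_i \sqrt{1 - z_i^2}\, \langle v, F u_i^{\perp}\rangle + (1 - z_i^2) \langle u_i^{\perp}, F u_i^{\perp}\rangle \tag{228}\\
&\ge \langle u_i^{\perp}, F u_i^{\perp}\rangle - 2 z_i - z_i^2 \tag{229}\\
&\ge \langle u_i^{\perp}, F u_i^{\perp}\rangle - 3 z_i . \tag{230}
\end{align}

It follows from Lemma G.6 that $z_i \le \varepsilon^4/3$ with probability at least $1 - n^{-10}$ for all $n$ large enough, and any fixed $i \ge 2$. Therefore, for all $n \ge n_0(\varepsilon)$, we have that

$$\mathbb{P}\Big( \langle u_i, F u_i\rangle \ge \langle u_i^{\perp}, F u_i^{\perp}\rangle - \varepsilon^4 \Big) \ge 1 - \frac{1}{n^{10}} . \tag{231}$$

We are now left with the task of lower bounding $\langle u_i^{\perp}, F u_i^{\perp}\rangle$. By definition, we have

\begin{align}
\langle u_i^{\perp}, F u_i^{\perp}\rangle &= \frac{1}{n}\sum_{k=1}^n \sqrt{1 - \psi_k^2}\; (\sqrt{n}\,u^{\perp}_{i,k})^2 \tag{232}\\
&\stackrel{(a)}{\ge} \frac{1}{n}\sum_{k=1}^n \Big( 1 - \frac{1}{2}\psi_k^2 - 2\psi_k^4 \Big) (\sqrt{n}\,u^{\perp}_{i,k})^2 \tag{233}\\
&\stackrel{(b)}{\ge} 1 - \frac{\varepsilon^2}{2}\, \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^2 (\sqrt{n}\,u^{\perp}_{i,k})^2 - 2\varepsilon^4\, \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^4 (\sqrt{n}\,u^{\perp}_{i,k})^2 , \tag{234}
\end{align}

where inequality (a) follows since $\sqrt{1 - x} \ge 1 - (x/2) - 2x^2$ for $x \in [0,1]$, and (b) because $|R(x)| \le |x|$.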
We next consider each of the sums on the right-hand side of Eq. (234). These take the form

$$S_q \equiv \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u_{1,k})^q (\sqrt{n}\,u^{\perp}_{i,k})^2 , \tag{235}$$

where $q = 2$ (for the first sum) or $q = 4$ (for the second). Using this notation, we have

$$\langle u_i^{\perp}, F u_i^{\perp}\rangle \ge 1 - \frac{\varepsilon^2}{2} S_2 - 2\varepsilon^4 S_4 . \tag{236}$$

The term $S_4$ has been already dealt with in the proof of Lemma G.7, see Eq. (221). By the same
derivation, we conclude that there exists an absolute constant $C$ such that


P(.54 > C) < 


1 


w- 


(237) 


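for all $n \ge n_0$.

Next consider $S_2$. We decompose $u_1 = z_1 v + \sqrt{1 - z_1^2}\, u_1^{\perp}$ where $z_1 = |\langle u_1, v\rangle|$ and $\langle u_1^{\perp}, v\rangle = 0$. Expanding the square, and using $v_k = 1/\sqrt{n}$, we get

$$S_2 = z_1^2\, \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{i,k})^2 + 2 z_1 \sqrt{1 - z_1^2}\; \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k}) (\sqrt{n}\,u^{\perp}_{i,k})^2 + (1 - z_1^2)\, \frac{1}{n}\sum_{k=1}^n (\sqrt{n}\,u^{\perp}_{1,k})^2 (\sqrt{n}\,u^{\perp}_{i,k})^2 . \tag{238}$$

Because of the invariance of the GOE distribution under orthogonal transformations, the pair $\{u_1^{\perp}, u_i^{\perp}\}$ is a uniformly random orthonormal pair, orthogonal to $v$. Further, it is independent of $z_1$. By applying Lemma G.5 we obtain that, for all $t > 0$ and all $n \ge n_0(t)$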


1 

n 


k=l 


n 


k=l 


n 


k=l 


, \ 

1 

(239) 

'> 1 + t 1 


tkf > 

1 

(240) 



(241) 


Using these in Eq. (238) together with $z_1 \in [0,1]$, we get


P(.52 > 1 + t) < 


n“ 


(242) 


for all $n \ge n_0(t)$. Using this together with Eq. (237) in Eq. (236) (with $t = \varepsilon^2$), we obtain that
there exists an absolute constant $C > 0$ such that, for all $n \ge n_0(\varepsilon)$


1 


uf,Fuf) > 1 - -£2 - ) > 1 - — 


1 


n' 


The claim (227) follows since $1 - \varepsilon^2/2 \ge \sqrt{1 - \varepsilon^2}$ for $\varepsilon \in [0,1]$, and using Eq. (231).


(243) 


□ 
We are now in position to prove Lemma G.4.


Proof of Lemma G.4. Fix $i \in \{2, \dots, n\delta + 1\}$. We claim that $\langle u_i, FBF u_i\rangle \ge 2 - 2\varepsilon^2 - C\delta^{2/3} - C\varepsilon^4$ holds with probability larger than $1 - C/n^2$. In order to prove this, note that

\begin{align}
\langle u_i, FBF u_i\rangle &= \xi_1 \langle u_1, F u_i\rangle^2 + \xi_i \langle u_i, F u_i\rangle^2 + \langle P^{\perp}_{1,i} F u_i, B\, P^{\perp}_{1,i} F u_i\rangle \tag{244}\\
&\ge \xi_1 \langle u_1, F u_i\rangle^2 + \xi_i \langle u_i, F u_i\rangle^2 + \xi_n \|P^{\perp}_{1,i} F u_i\|_2^2 . \tag{245}
\end{align}

Let $\xi_*(\delta)$ be defined as in the previous section, namely as the unique positive solution of Eq. (147). (In particular, $\xi_*(\delta) \ge 2 - C\delta^{2/3}$.) Note that by [KY13, Theorem 2.7], we have, for all $n$ large enough

P(£) > 1 - , (246) 

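$$\mathcal{E} \equiv \Big\{ \xi_1 \ge \lambda + \lambda^{-1} - n^{-0.4} ,\;\; \xi_{n\delta+1} \ge \xi_*(\delta) - n^{-0.4} ,\;\; \xi_n \ge -2 - n^{-0.4} \Big\} . \tag{247}$$

On the event $\mathcal{E}$, we have, by Eq. (245),

\begin{align}
\langle u_i, FBF u_i\rangle &\ge (\lambda + \lambda^{-1} - n^{-0.4}) \langle u_1, F u_i\rangle^2 + (\xi_*(\delta) - n^{-0.4}) \langle u_i, F u_i\rangle^2 - (2 + n^{-0.4}) \|P^{\perp}_{1,i} F u_i\|_2^2 \tag{248}\\
&\ge (2 - C\delta^{2/3} - n^{-0.4}) \langle u_i, F u_i\rangle^2 - 3 \|P^{\perp}_{1,i} F u_i\|_2^2 . \tag{249}
\end{align}

Using Eq. (246), Lemma G.7 and Lemma G.8 we obtain, for all $n \ge n_0(\varepsilon)$

$$\mathbb{P}\Big( \langle u_i, FBF u_i\rangle \ge (2 - C\delta^{2/3} - n^{-0.4})(1 - \varepsilon^2 - C\varepsilon^4) - C\varepsilon^4 \Big) \ge 1 - \frac{C}{n^2} . \tag{250}$$

The lemma follows by adjusting the constant $C$, and union bound over $i \in \{2, \dots, n\delta + 1\}$. □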


H Proof of Theorem 6 (estimation)

H.l A rounding lemma 

We will need the following rounding lemma, that is of independent interest. While we state it for 
general expectations of random variables, we will apply it to finite sums (i.e. expectations with 
respect to random variables that take finitely many values). 

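Lemma H.1. For $t \in \mathbb{R}_{\ge 0}$, define $s_t : \mathbb{R} \to \{+1, 0, -1\}$ by $s_t(x) = 1$ if $x \ge t$, $s_t(x) = -1$ if $x \le -t$, and $s_t(x) = 0$ otherwise.

Let $X_0$, $Y$ be two random variables with $\mathbb{P}(X_0 = +1) = \mathbb{P}(X_0 = -1) = 1/2$, $\mathbb{E}(X_0 Y) \ge \varepsilon > 0$ and $\mathbb{E}(Y^2) = 1$. Then, there exists $t_*$ (depending on the joint law of $X_0$, $Y$) such that

$$\mathbb{E}\{X_0\, s_{t_*}(Y)\} \ge \frac{\varepsilon^2}{4} . \tag{251}$$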
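Proof. Define $Z = X_0 Y$. Then the assumptions translate into $\mathbb{E}(Z) \ge \varepsilon$ and $\mathbb{E}(Z^2) = 1$, while the claim is equivalent to $\mathbb{E}\{s_{t_*}(Z)\} \ge \varepsilon^2/4$ (note indeed that $s_t(\,\cdot\,)$ is an odd function). Now we have

\begin{align}
\varepsilon \le \mathbb{E}(Z) &= \int_0^{\infty} \big[ \mathbb{P}(Z \ge t) - \mathbb{P}(Z \le -t) \big]\, {\rm d}t \tag{252}\\
&\le \int_0^{T} \big[ \mathbb{P}(Z \ge t) - \mathbb{P}(Z \le -t) \big]\, {\rm d}t + \frac{1}{T}\int_T^{\infty} t\, \big[ \mathbb{P}(Z \ge t) + \mathbb{P}(Z \le -t) \big]\, {\rm d}t \tag{253}\\
&\le \int_0^{T} \mathbb{E}\{s_t(Z)\}\, {\rm d}t + \frac{1}{T}\mathbb{E}\{Z^2\} . \tag{254}
\end{align}

Taking $T = 2/\varepsilon$, it follows that

$$\frac{1}{T}\int_0^{T} \mathbb{E}\{s_t(Z)\}\, {\rm d}t \ge \frac{\varepsilon^2}{4} . \tag{255}$$

Since the average of $\mathbb{E}\{s_t(Z)\}$ over the interval $t \in [0, T]$ is at least $\varepsilon^2/4$, there must exist $t_* \in [0, T]$ such that $\mathbb{E}\{s_{t_*}(Z)\} \ge \varepsilon^2/4$. □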
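For a finite sample, the threshold $t_*$ of Lemma H.1 can be found by a direct scan, since $t \mapsto \mathbb{E}\{X_0 s_t(Y)\}$ is piecewise constant and only changes at the values $|y_i|$. The sketch below illustrates this on synthetic data; the function names and the data-generating model are assumptions made for the example, not part of the paper's algorithm.

```python
import numpy as np

def s_t(x, t):
    """Three-level rounding: +1 if x >= t, -1 if x <= -t, 0 otherwise (for t > 0)."""
    return np.where(x >= t, 1.0, np.where(x <= -t, -1.0, 0.0))

def best_threshold(x0, y):
    """Scan the candidate thresholds t in {|y_i| > 0}; return (t_star, mean(x0 * s_t(y)))."""
    candidates = np.unique(np.abs(y))
    candidates = candidates[candidates > 0]
    values = [np.mean(x0 * s_t(y, t)) for t in candidates]
    k = int(np.argmax(values))
    return float(candidates[k]), float(values[k])

# Synthetic example: labels x0 in {+1,-1}, noisy observation y with mean(y^2) = 1.
rng = np.random.default_rng(1)
x0 = rng.choice([-1.0, 1.0], size=400)
y = 0.5 * x0 + rng.standard_normal(400)
y /= np.sqrt(np.mean(y ** 2))
eps = float(np.mean(x0 * y))          # empirical counterpart of E(X0 Y)
t_star, corr = best_threshold(x0, y)  # the lemma guarantees corr >= eps**2 / 4 when eps > 0
```

The scan costs $O(n^2)$ operations as written; sorting $|y_i|$ and using cumulative sums would reduce this, at the price of a less transparent example.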


H.2 Proof of Theorem [5] 

Throughout this appendix, the partition $V = S_1 \cup S_2$ is fixed. Note that $G_1 \sim G(n, a'/n, b'/n)$, and $G_2 \sim G(n, a'\delta_n/n, b'\delta_n/n)$, with $a' = a/(1 + \delta_n)$, $b' = b/(1 + \delta_n)$. For simplicity of notation, we will use $a$ instead of $a'$ and $b$ instead of $b'$. Note that this does not change the assumptions because it only implies an $o_n(1)$ shift in $a$, $b$. Also, $G_1$ and $G_2$ are dependent because they cannot share edges. However, if they are sampled independently, they will share, with high probability, only $O(1)$ edges. We will therefore treat them as independent: the incurred error is negligible.
Setting, by definition, the diagonal entries of $A^{\rm cen}_{G}$ to be equal to $\lambda\sqrt{d}/n$, we have

$$\frac{A^{\rm cen}_{G}}{\sqrt{d}} = \frac{\lambda}{n}\, x_0 x_0^{\sf T} + E , \tag{256}$$

where $E = E^{\sf T}$ has zero mean, $\mathbb{E}\{E\} = 0$, with $E_{ii} = 0$, and $(E_{ij})_{i<j}$ independent

$$E_{ij} = \begin{cases} \dfrac{1 - p_{ij}}{\sqrt{d}} & \text{with probability } p_{ij} ,\\[2mm] -\dfrac{p_{ij}}{\sqrt{d}} & \text{with probability } 1 - p_{ij} . \end{cases} \tag{257}$$

Here $p_{ij} = a/n$ if $\{i,j\} \subseteq S_1$ or $\{i,j\} \subseteq S_2$, and $p_{ij} = b/n$ otherwise.
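The decomposition (256)-(257) amounts to the identity $d/n \pm \lambda\sqrt{d}/n \in \{a/n, b/n\}$, which holds because $\lambda\sqrt{d} = (a-b)/2$. The sketch below samples a small two-group graph and checks entrywise that the noise matrix equals $(A - P)/\sqrt{d}$ off the diagonal; the function name `two_group_noise` and the chosen parameters are illustrative assumptions.

```python
import numpy as np

def two_group_noise(a, b, x0, rng):
    """Sample a two-group graph and form E = A_cen / sqrt(d) - (lam / n) x0 x0^T."""
    n = x0.shape[0]
    d = (a + b) / 2.0
    lam = (a - b) / np.sqrt(2.0 * (a + b))
    same = np.equal.outer(x0, x0)                    # True when i and j share a group
    P = np.where(same, a / n, b / n)                 # edge probabilities p_ij
    upper = np.triu(rng.random((n, n)) < P, k=1)     # independent edges for i < j
    A = (upper | upper.T).astype(float)              # symmetric adjacency, zero diagonal
    A_cen = A - d / n                                # A - (d/n) 1 1^T
    E = A_cen / np.sqrt(d) - (lam / n) * np.outer(x0, x0)
    np.fill_diagonal(E, 0.0)                         # convention E_ii = 0
    return A, P, E, d

rng = np.random.default_rng(2)
x0 = np.array([1.0] * 5 + [-1.0] * 5)
A, P, E, d = two_group_noise(5.0, 1.0, x0, rng)
off_diag = ~np.eye(10, dtype=bool)
matches = bool(np.allclose(E[off_diag], ((A - P) / np.sqrt(d))[off_diag]))
```

Each off-diagonal entry of `E` is therefore a centered Bernoulli variable divided by $\sqrt{d}$, in agreement with Eq. (257).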

Proceeding exactly as in the proof of Theorem 8, we can compare the SDP value for the matrix
$E$, to the SDP value for a Gaussian matrix. We obtain the following estimate, whose proof we
omit.


Lemma H.2. Let $E \in \mathbb{R}^{n\times n}$ be the random matrix defined above, with $d = (a + b)/2$, and $\lambda = (a - b)/\sqrt{2(a + b)}$. Let $W \sim {\rm GOE}(n)$ be a Gaussian random matrix with $(W_{ij})_{i<j} \sim {\sf N}(0, 1/n)$. Then, there exists $C = C(\lambda)$ such that, with probability at least $1 - C e^{-n/C}$ for all $n \ge n_0(a, b)$

$$\left| \frac{1}{n}{\rm SDP}(E) - \frac{1}{n}{\rm SDP}(W) \right| \le \frac{C \log d}{d^{1/10}} . \tag{258}$$

Further $C(\lambda)$ is bounded over compact intervals $\lambda \in [0, \lambda_{\max}]$.
As a consequence of this lemma, and of Theorem 5, we have

$$\frac{1}{n}{\rm SDP}(E) \le 2 + \frac{C \log d}{d^{1/10}} \tag{259}$$

with probability at least $1 - C e^{-n/C}$.

Consider then a maximizer of the SDP Q, with M = We have, by Theorem [5] and 

Theorem [5] (or, equivalently, by Theorem [3]) 


4(®o®o'^, + -(-E, X*) = ^SDP(A^^") > 2 + A(e), (260) 

n nVd 

for all d > (i*(e), with probability at least 1 — . Using the bound (I259p . this implies, for A 

bounded and some A 2 (e) > 0, 


n 


1 J ^ 1 

= ^{xoXo^,X^) > A2(e) 


2=1 


(261) 


Since X* G PSDi(n), we have > 0 and Hence there exists I* G [n] such that 

^ |(a;o,W*)| > \/A2(e). 


(262) 


Assume, without loss of generality, that {xq,vj^) > 0. Applying Lemma [H.ll to the pair (Xo,T) 
with joint distribution n~^ Sj=i ^a:o conclude that there exists U G M such that 

l{xo,suM)>^. (263) 

n 4 

(Here Si,(•) is understood to be applied componentwise.) Note that st^{vj^) = fQj. gome 

index J* G [n], whence 

(264) 

n 4 

In other words, at least one of the estimators $\{x^{(i,j)} : i, j \in [n]\}$ has a good correlation with the ground truth. We are left to prove that step (iv) in our algorithm does indeed select such a pair of indices $i, j$. This follows from the following simple concentration lemma.

Lemma H.3. There exists a constant C = C{a,b) bounded for a, b in bounded intervals, such that, 
for all s G [0,1]. 


max 

i,je[n] 


{x 


ihj) 




+ 


\y/d 


> sJn > < C e 


— y/ns^jc 


(265) 


Proof. Throughout the proof, C denotes a constant that might depend on a, b, bounded for a, b in 
compact intervals. For any fixed vector x G {+1,0, —1}"' we have, by Azuma-Hoeffding inequality 

P||(£, -E(®, Ag“®)| > s ; 1 ^ 2 ! < m 2 } < . (266) 
However, by Chernoff bound, m 2 < Cy/n with probability at least 1 — Ce ^ whence, for all 

s G [0,1], 

P| I (a, AZlx) - ]E(a, I > s^A^} < C . (267) 

On the other hand $\mathbb{E} A^{\rm cen}_{G} = (a - b)(x_0 x_0^{\sf T} - {\rm I})/(2n)$, whence

XVd , XVd 


The claim follows by taking union bound over i,j G [n], since x^'^X) jg independent of Gi. 
It follows from the last lemma, and Eq. (I264p that 

- max = A 3 (e), 

Tl ijG[n] 54 

with probability at least 1 — Hence, again by the last lemma 

max{(*(*’^), : (i,j) G [n]^\{x^^’^\xo)\ < ^ 


(268) 

□ 

(269) 


(270) 


with probability at least 1 — Ce The claim follows since on the events (j269p and ^2701) we 

necessarily have | a;o)| > nA 2 (e)/ 8 . 

I Proof of Corollary 4.1 (robustness)

Recall that $A^{\rm cen}_{G} = A_G - (d/n)\mathbf{1}\mathbf{1}^{\sf T}$ denotes the centered adjacency matrix. If $G$ and $\tilde G$ differ in one edge, then $|{\rm SDP}(A^{\rm cen}_{G}) - {\rm SDP}(A^{\rm cen}_{\tilde G})| \le 1$: a complete proof of this simple fact is given in the proof of Lemma A.2 below. The claim then follows immediately since (using the coupling in the statement) $|{\rm SDP}(A^{\rm cen}_{G}) - {\rm SDP}(A^{\rm cen}_{\tilde G})| = o(n)$ with high probability.

J Proof of Theorem 7 (testing r > 2 communities)

The proof is very similar to the one of Theorem 3, and we therefore limit ourselves to an outline emphasizing the main differences. Throughout the proof we set

\begin{align}
d &= \frac{1}{r}\big[ a + (r - 1) b \big] , \tag{271}\\
\lambda &= \frac{a - b}{r\sqrt{d}} = \frac{a - b}{\sqrt{r\,(a + (r - 1) b)}} \ge 1 + \varepsilon . \tag{272}
\end{align}

Further, without loss of generality, we can assume $\lambda \in [0, \lambda_{\max}]$ with $\lambda_{\max} > 1$ fixed. Also, the concentration lemma A.2 applies unchanged to ${\rm SDP}(A^{\rm cen}_{G})$ for $G \sim G_r(n, a/n, b/n)$. It is therefore sufficient to check that the error probability vanishes as $n \to \infty$. The exponentially decaying error
rate follows. 
Consider first the probability of a false positive (i.e. declaring that $r$ communities are present when $G \sim G(n, d/n)$). As for Theorem 3, we have

$$\lim_{n\to\infty} \mathbb{P}_0\big( T(G; \delta) = 1 \big) = \lim_{n\to\infty} \mathbb{P}_0\Big( \frac{1}{n}{\rm SDP}(A^{\rm cen}_{G}) \ge 2(1 + \delta)\sqrt{d} \Big) = 0 , \tag{273}$$

where the last equality holds for any d > do(<5) by Theorem [H 

We are then left with the task of proving that the probability of false negatives vanishes. This follows the same steps as for Theorem 3. Namely: (i) We approximate the value of ${\rm SDP}(A^{\rm cen}_{G})$ for $G \sim G_r(n, a/n, b/n)$ by the value of the SDP for a suitable deformed GOE model; (ii) We analyze the deformed GOE model.

The relevant deformed GOE random matrix is defined as follows. Let $B_0(r) \in \mathbb{R}^{n\times n}$ be given by

$$B_0(r)_{ij} = \begin{cases} (r - 1)/n & \text{if } \{i, j\} \subseteq S_{\ell} \text{ for some } \ell \in [r] ,\\ -1/n & \text{otherwise.} \end{cases} \tag{274}$$

Note that $B_0(r)$ has rank $(r - 1)$, and all of its non-zero eigenvalues are equal to $1$. Hence $B_0 = \sum_{k=1}^{r-1} v_k v_k^{\sf T}$, for $v_1, \dots, v_{r-1} \in \mathbb{R}^n$ an orthonormal set. We then let

$$B(\lambda, r) = \lambda B_0(r) + W , \tag{275}$$

with $W \sim {\rm GOE}(n)$.
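The spectral claim about $B_0(r)$ is easy to verify numerically for small $n$. The sketch below builds $B_0(r)$ for equal-size groups and a sample of $B(\lambda, r)$; it assumes the GOE normalization with off-diagonal variance $1/n$ and diagonal variance $2/n$, and the function names are illustrative.

```python
import numpy as np

def b0_matrix(n, r):
    """B_0(r) of Eq. (274) for r groups of equal size n/r (group labels are contiguous)."""
    assert n % r == 0, "this sketch assumes r divides n"
    labels = np.repeat(np.arange(r), n // r)
    same = np.equal.outer(labels, labels)
    return np.where(same, (r - 1) / n, -1.0 / n)

def deformed_goe(n, r, lam, rng):
    """Sample B(lam, r) = lam * B_0(r) + W with W symmetric Gaussian (assumed GOE scaling)."""
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2.0 * n)   # off-diagonal variance 1/n, diagonal variance 2/n
    return lam * b0_matrix(n, r) + W

B0 = b0_matrix(12, 3)
eigs = np.sort(np.linalg.eigvalsh(B0))   # expect r - 1 = 2 eigenvalues equal to 1, the rest 0
B = deformed_goe(12, 3, 1.5, np.random.default_rng(4))
```

Writing $B_0(r) = (r/n)\sum_{\ell} \mathbf{1}_{S_\ell}\mathbf{1}_{S_\ell}^{\sf T} - (1/n)\mathbf{1}\mathbf{1}^{\sf T}$ makes the rank and the unit eigenvalues transparent.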

We are now in position to state an analogue of the approximation theorem 8.

Theorem 9. Let $G \sim G_r(n, a/n, b/n)$, $d = (a + (r - 1) b)/r$, and $A^{\rm cen}_{G} = A_G - (d/n)\mathbf{1}\mathbf{1}^{\sf T}$ be its centered adjacency matrix. Let $\lambda = (a - b)/(r\sqrt{d})$ and define $B = B(\lambda, r)$ to be the deformed GOE matrix in Eq. (275). Then, there exists $C = C(\lambda, r)$ such that, with probability at least $1 - C e^{-n/C}$, for all $n \ge n_0(a, b, r)$

\begin{align}
\left| \frac{1}{n\sqrt{d}}{\rm SDP}(A^{\rm cen}_{G}) - \frac{1}{n}{\rm SDP}(B(\lambda, r)) \right| &\le \frac{C \log d}{d^{1/10}} , \tag{276}\\
\left| \frac{1}{n\sqrt{d}}{\rm SDP}(-A^{\rm cen}_{G}) - \frac{1}{n}{\rm SDP}(-B(\lambda, r)) \right| &\le \frac{C \log d}{d^{1/10}} . \tag{277}
\end{align}

Further $C(\lambda, r)$ is bounded over compact intervals $\lambda \in [0, \lambda_{\max}]$.

The proof of this theorem is exactly equal to the one of Theorem 8: (i) We introduce a rank-constrained version of the above SDP, and bound the error using the Grothendieck-type inequality of Theorem [H; (ii) We introduce a 'finite-temperature' smoothing of this optimization problem, and bound the error using Lemma 3.2; (iii) We use the Lindeberg method as in Lemma 3.3 to replace the centered adjacency matrix $A^{\rm cen}_{G}$ by the Gaussian model $B(\lambda, r)$. We will omit further details of this proof.

We then analyze the model $B(\lambda, r)$, and establish the following analogue of Theorem 5.

Theorem 10. Let $B = B(\lambda, r) \in \mathbb{R}^{n\times n}$ be a symmetric matrix distributed according to the model (275), $r \ge 2$.

If $\lambda > 1$, then there exists $\Delta(\lambda, r) > 0$ such that ${\rm SDP}(B(\lambda, r))/n \ge 2 + \Delta(\lambda, r)$ with probability converging to one as $n \to \infty$.
The proof of this result is very similar to the one of Theorem 5. We outline the main differences in Section J.1.

Armed with these theorems, we can now lower bound ${\rm SDP}(A^{\rm cen}_{G})$ for $G \sim G_r(n, a/n, b/n)$. Namely, for $\lambda \ge 1 + \varepsilon$ we have, with high probability,

\begin{align}
\frac{1}{n\sqrt{d}}{\rm SDP}(A^{\rm cen}_{G}) &\ge \frac{1}{n}{\rm SDP}(B(\lambda, r)) - \frac{1}{4}\Delta(1 + \varepsilon, r) \tag{278}\\
&\ge \frac{1}{n}{\rm SDP}(B(1 + \varepsilon, r)) - \frac{1}{4}\Delta(1 + \varepsilon, r) \tag{279}\\
&\ge 2 + \frac{3}{4}\Delta(1 + \varepsilon, r) . \tag{280}
\end{align}

We then conclude selecting $\delta_*(\varepsilon) = \Delta(1 + \varepsilon)/2 > 0$, as in the proof of Theorem 5, see Eq. (57).


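J.1 Proof outline for Theorem 10

Throughout this section $B = B(\lambda, r)$ with $\lambda \ge 1 + \varepsilon$ and $r \ge 2$ is defined as per Eq. (275).

As for the proof of Theorem 5, the proof consists in constructing a suitable witness $X \in {\rm PSD}_1(n)$, and then lower bounding the value $\langle B, X\rangle$. We describe here the witness construction since the lower bound on $\langle B, X\rangle$ is analogous to the one in the case $r = 2$.

Let $(u_1, \xi_1), \dots, (u_n, \xi_n)$ denote the eigenpairs of $B$, namely

$$B u_k = \xi_k u_k , \tag{281}$$

where $\xi_1 \ge \xi_2 \ge \dots \ge \xi_n$. Our construction depends on parameters $\varepsilon, \delta > 0$. Let $V \in \mathbb{R}^{n\times(r-1)}$ be the matrix whose $i$-th column is the eigenvector $u_i$ (and hence containing eigenvectors $u_1, \dots, u_{r-1}$), and $U \in \mathbb{R}^{n\times n\delta}$ be the matrix whose $i$-th column is eigenvector $u_{r+i-1}$ (and hence containing eigenvectors $u_r, \dots, u_{r+n\delta-1}$).

Define, with an abuse of notation, $R : \mathbb{R}^{r-1} \to \mathbb{R}^{r-1}$ as follows

$$R(x) = \begin{cases} x & \text{if } \|x\|_2 \le 1 ,\\ x/\|x\|_2 & \text{otherwise,} \end{cases} \tag{282}$$

and define $\Psi \in \mathbb{R}^{n\times(r-1)}$ as $\Psi = R(\varepsilon\sqrt{n}\,V)$ where $R(\,\cdot\,)$ is understood to be applied row-by-row to $\varepsilon\sqrt{n}\,V \in \mathbb{R}^{n\times(r-1)}$. Equivalently, for each $i \in [n]$, we have

$$\Psi^{\sf T} e_i = R(\varepsilon\sqrt{n}\,V^{\sf T} e_i) . \tag{283}$$

We finally define a diagonal matrix $D \in \mathbb{R}^{n\times n}$ with entries

$$D_{ii} = \frac{\sqrt{1 - \|\Psi^{\sf T} e_i\|_2^2}}{\|U^{\sf T} e_i\|_2} , \tag{284}$$

and construct the witness by setting

$$X = \Psi\Psi^{\sf T} + D U U^{\sf T} D . \tag{285}$$

We have $X \in {\rm PSD}_1(n)$ by construction. The proof that, with high probability, $\langle B, X\rangle/n \ge 2 + \Delta(\lambda, r)$ follows the same steps as for the case $r = 2$, detailed in Appendix G.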
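The witness construction (282)-(285) can be sketched in a few lines, which also makes the feasibility claim $X \in {\rm PSD}_1(n)$ easy to check numerically. The code below is an illustration under stated assumptions: the function name `witness`, the rounding of $n\delta$ to an integer, and the test parameters are not taken from the paper.

```python
import numpy as np

def witness(B, r, eps, delta):
    """Build X = Psi Psi^T + D U U^T D from the eigenvectors of a symmetric matrix B."""
    n = B.shape[0]
    m = max(1, int(round(n * delta)))          # number of columns of U (n*delta, rounded)
    _, vecs = np.linalg.eigh(B)                # eigh returns eigenvalues in ascending order
    vecs = vecs[:, ::-1]                       # reorder so that xi_1 >= xi_2 >= ...
    V = vecs[:, : r - 1]                       # top r-1 eigenvectors
    U = vecs[:, r - 1 : r - 1 + m]             # next n*delta eigenvectors
    rows = eps * np.sqrt(n) * V
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    Psi = rows / np.maximum(norms, 1.0)        # row-wise R: x if ||x|| <= 1, else x/||x||
    slack = np.clip(1.0 - np.sum(Psi ** 2, axis=1), 0.0, None)
    D = np.sqrt(slack) / np.linalg.norm(U, axis=1)   # diagonal of D, Eq. (284)
    DU = D[:, None] * U
    return Psi @ Psi.T + DU @ DU.T

rng = np.random.default_rng(3)
n, r = 60, 3
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2.0 * n)
labels = np.repeat(np.arange(r), n // r)
B0 = np.where(np.equal.outer(labels, labels), (r - 1) / n, -1.0 / n)
X = witness(2.0 * B0 + W, r, eps=0.5, delta=0.2)
value = float(np.sum((2.0 * B0 + W) * X)) / n   # <B, X> / n for this sample
```

Both summands of $X$ are Gram matrices, so positive semidefiniteness is automatic, and the choice of $D$ in Eq. (284) is exactly what makes the diagonal equal to one. The objective value at such small $n$ is only indicative; the bound $\langle B, X\rangle/n \ge 2 + \Delta(\lambda, r)$ is an asymptotic statement.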